Abstract. The paper discusses a family of Markov processes that represent
many particle systems, and their limiting behaviour when the number of particles
goes to infinity. The first part concerns models of biological systems: a model
for sympatric speciation, i.e. the process in which a genetically homogeneous
population is split into two or more different species sharing the same habitat,
and models for swarming animals. The second part of the paper deals with abstract
many particle systems, and methods for rigorously deriving mean field
models.

These are notes from a series of lectures given at the 5th Summer School
on Methods and Models of Kinetic Theory, Porto Ercole, 2010. They are submitted
for publication in "Rivista di Matematica della Università di Parma".

1. Introduction 

As the title suggests, these lecture notes consist of two rather different parts, 
although there is one uniting theme: random, interacting many particle systems. 

The first part, dealing with applications from biology, begins with a model for 
sympatric speciation; this is the process in which a population of animals or plants is 
split in two or more separate species, remaining in the same geographical area (hence 
the word sympatric). In this case, the particles are individuals, and the interaction is 
the mating procedure and the selection process. This part is based mostly on [28]. A 
rather different class of models, more similar to the classical kinetic theory of gases, 
are models for swarms (that could be swarms of insects, flocking birds, schooling 
fish, or for that matter, crowds of people). The particles are then the individuals, 
and the interaction is usually the voluntary motion of the individuals, based on their 
visual perception of other individuals in the neighborhood. Such problems have 
attracted a lot of interest in the kinetic theory community recently, and although I
will give some of the important references, these notes do not give a complete review
of current work; they only give another example of how ideas from kinetic theory
can be applied to biological problems. To a large extent it is based on ongoing
research with Eric Carlen and Pierre Degond [9]. 

The remaining part of the notes deals with propagation of chaos, which very
vaguely means that if the particles initially are distributed independently of each 
other in phase space, then they remain independent along the evolution of the 
system. This never holds for interacting particle systems of the kind considered 
here, as long as the number of particles is finite, but for some models it can be proven
to hold in the limit of infinitely many particles. It is one of the major challenges
in kinetic theory to prove that propagation of chaos holds for real particle systems,
a topic that is discussed in detail in Pulvirenti's notes in this issue [39]. Here
we discuss a much easier case, where the microscopic model is already random,
a family of Markov jump processes in state spaces $E^n$ that represent $n$-particle
configurations, and the corresponding master equations. The propagation of chaos
can then be expressed in terms of the marginals of the $n$-particle distributions.
This approach to the propagation of chaos goes back to Mark Kac ([32]), and
the key ideas in that paper will be presented. A different approach was taken
by Grünbaum ([26]), closely related to de Finetti's theorem on the conditional
independence of exchangeable observations. This approach has been taken a step
further in [38], from which much of the material in these notes is taken. And [38]
was inspired in part by the lectures of P.L. Lions on mean field games ([34]). A
small section of these notes is essentially taken from one of the first lectures in his 
series. 

What unites these rather disparate topics is that they all deal with Markov processes
in a state space $E^n$, i.e. an $n$-fold product of a Euclidean space $E$, or
some submanifold of $E^n$ representing e.g. the conservation of energy. In the state
$v = (v_1, \dots, v_n) \in E^n$, each component $v_k$ represents the state of one particle. In
Kac's original model, and other models that represent real gases, the jumps only
change two components, $v_i$ and $v_j$, say, simultaneously, although the rate at which
a particular couple of particles interact may be determined as a function of the full
state. No deep knowledge of Markov processes is needed to read these notes, only
a basic understanding of the definitions is assumed. A comprehensive book on the
topic is [22], and a standard reference with applications in physics and chemistry
is 07]. 

2. Applications from biology: a model for sympatric speciation 

There are several mathematical models for speciation that are related to the 
models from the kinetic theory of gases. The one that is presented here comes
from [28], but there are many other examples, and I will very briefly mention a
couple. But first we need to reflect on the concept of species. Although most
of us have a vague idea of what a species is, it is by no means an easy task to 
make a proper definition. Until at least the 18th century, the flora and fauna 
were thought of as being rather stationary, and a species was characterized by 
producing an offspring of (essentially) the same kind. Notably Linnaeus created 
a taxonomic system for classifying and naming the species, a system that is still 
used today. But it does not really define the concept of a species, rather it gives a 
hierarchical structure, with similar plants, or animals, grouped together. Comte
de Buffon, a contemporary of Linnaeus (both of them were born in 1707), classified
two individuals as belonging to the same species if they can produce fertile offspring. 
This definition is problematic for several reasons, one being that it is not a transitive 
relation: One could have three candidates for a species, A, B, and C, such that A 
and B can produce fertile offspring, B and C too, but not C and A. It may also 
happen that the result depends on whether A or B is female. The discovery of DNA
and techniques to analyze the genetic code have provided new means for classifying
species, but there is no general definition of "species" that is useful in all situations. 

With Darwin's On the origin of species [16], a mechanism for evolution was described:
due to phenotypic variation within a population, some individuals will
reproduce less efficiently, and there will be a selection against this phenotypic character.
But this mechanism is not enough to explain how a species can evolve into
two different species. A nice discussion on this topic can be found in the introduction
to van Doorn's thesis Sexual selection and sympatric speciation [46].

Allopatric speciation may happen if a homogeneous population is split into two
geographically separated regions, such as two islands. By selection the two sub-populations
will then evolve to adapt to the local environment, but phenotypic
characters that are not selected against will also change, and eventually the two
sub-populations may be so different that they have become two different species.
It is much more difficult to understand sympatric speciation, where the two sub-populations
share the same geographical area. Van Doorn lists a number of obstacles
for speciation to take place, and exemplifies this with birds feeding on different size
grains: small, medium and large. The fitness of a bird is quantified in terms of the 
feeding rate. Speciation would now mean, for example, that one sub-population 
specializes in feeding on small seeds and another one on large seeds, but for this to 
happen, the population must reach a state known as "disruptive selection", i.e. a 
situation where the feeding rate could improve by changing a phenotypic character 
(such as the beak length) either in one direction or the other. A first step towards 
speciation is taken if two sub-populations evolve in different directions, leading to 
a phenotypic "polymorphism", but unless the ecological landscape gives an advan- 
tage to the smaller sub-population, only the larger one will remain, and hence the 
polymorphism is eventually lost, and no speciation can take place. 

The next step towards speciation is the evolution of a "reproductive isolation", 
a mechanism that prevents the formation of hybrids. A key concept is that of 
"assortative mating", meaning that reproduction takes place essentially within sub- 
populations: individuals choose mating partners according to specific criteria. Fi- 
nally, some kind of dependence should develop between the genes that are respon- 
sible for the fitness to the landscape (beak length, in our example), and the genes 
responsible for the assortative mating. 

All this is far from completely understood, and the literature on the subject 
is vast. The model for sympatric speciation presented here is one example that 
addresses the question. 

2.1. Adaptive dynamics. One approach to evolution and speciation is adaptive 
dynamics. A recent book treating this is [17], which states that adaptive dynamics
is "the long term evolutionary dynamics of quantitative characters driven by the
processes of mutation and selection", and is a theory that has been developed for
example by Geritz et al. [24]. A short introduction is given in [6].

In the easiest case, adaptive dynamics is concerned with the evolution of a scalar 
trait in a monomorphic population. A scalar trait could be for example the length 
of a bird's beak, x > 0, say, and that the population is monomorphic means that 
all individuals have exactly the same beak length. Adaptive dynamics takes place 
on a time scale much longer than the typical time scale of a population, and hence 
it is assumed that the resident population with beak length x is stationary. The 
main issue of adaptive dynamics is to understand what happens to a small group
of individuals that (by mutation) has a different trait value, $y$, say. The initial
growth rate of the rare mutant population, denoted $S_x(y)$, is sometimes called the
invasion exponent. Because the resident population is assumed to be stationary,
$S_x(x) = 0$. If $S_x(y) > 0$, then the mutant is more fit, and will eventually take over,
whereas if $S_x(y) < 0$, the mutant will not invade. The selection gradient $S_x'(x)$,
i.e. the derivative of $S_x(y)$ with respect to $y$ evaluated at $y = x$, determines
the direction of evolution of the trait: if $S_x'(x) > 0$, then an invading mutant with
trait $y > x$ has a better fitness, and will replace the resident population. The new
resident population will have trait value $y$. Similarly, if $S_x'(x) < 0$, then the resident
population will be replaced by a population with smaller trait value. Values of $x$
such that $S_x'(x) = 0$ are known as evolutionarily singular states, and if such a state corresponds
to a local fitness maximum, it is called an evolutionarily stable strategy (ESS). A resident
population at an ESS cannot be invaded by a nearby mutant, because all nearby
strategies are less fit to the environment. Disruptive selection can occur when an
evolutionarily singular state is at a local fitness minimum. Then a nearby mutation
at either side of $x$ has a better fitness, and this is what is needed for speciation to
take place.

Another concept is a convergence stable strategy, which is a strategy such that 
monomorphic populations with y close to x can be invaded by mutants which are 
even closer to x. When this is the case, the trait value of the resident population 
will approach x. 

The mutations arrive in a population randomly, and it is not necessarily true
that the mutants have trait values close to that of the resident population, but
if the mutations are small, and the frequency of mutations is scaled properly, it is
possible to derive an ODE which describes the rate of change of the trait of the
resident population, the canonical equation of adaptive dynamics.

An approach based on the Hamilton-Jacobi equations can be found in [21].

2.2. Examples of mathematical models of speciation. There are several ex- 
amples of mathematical models for the competition within a population structured 
according to some phenotypic trait. Desvillettes et al. [19] consider the following 
model of logistic type: 

\[ \frac{\partial f}{\partial t}(t,y) = \Big( a(y) - \int_{\mathcal Y} b(y,y')\, f(t,y')\,dy' \Big) f(t,y) . \]

Here $f(y)$ is a density describing the distribution of the population according to
the trait $y \in \mathcal Y$, where $\mathcal Y$ is compact. The birth rate of individuals with trait $y$ is
$a(y)$, and death rate is

\[ \int_{\mathcal Y} b(y,y')\, f(y')\,dy' . \]

The death rate can be seen as a model for competition within the population, the
function $b(y, y')$ giving death rate of an individual of trait $y$ due to the interaction
with an individual of trait $y'$.

The authors prove global in time existence and uniqueness in $L^1(\mathcal Y)$ for this
equation, assuming sufficient regularity of the functions $a(y)$ and $b(y, y')$. They also
present the results of numerical simulations that show how an initially unimodal 
trait distribution evolves into a bimodal, and then multimodal distribution. In fact, 
they even show that a limiting solution must consist of a sum of Dirac masses. 
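
The logistic model above is easy to explore numerically. The following is only a minimal sketch, not the scheme used in [19]: it discretizes the trait space on a uniform grid and takes explicit Euler steps. The function name, the grid, and the particular choices of $a(y)$, $b(y,y')$ and the initial density are hypothetical and chosen for illustration.

```python
import numpy as np

def simulate_logistic_trait_model(a, b, f0, dy, dt, n_steps):
    """Explicit Euler steps for df/dt = (a(y) - int b(y,y') f(y') dy') f(y).

    a  : birth rates on the trait grid, shape (m,)
    b  : competition kernel b(y_i, y_j), shape (m, m)
    f0 : initial density on the grid, shape (m,)
    """
    f = f0.copy()
    for _ in range(n_steps):
        death = (b @ f) * dy            # Riemann sum for the competition integral
        f = f + dt * (a - death) * f    # one Euler step of the logistic equation
    return f

# Hypothetical data, for illustration only.
y = np.linspace(-2.0, 2.0, 201)                              # trait grid on a compact set
dy = y[1] - y[0]
a = np.exp(-y**2 / 2.0)                                      # birth rate a(y)
b = np.exp(-(y[:, None] - y[None, :])**2 / (2 * 0.3**2))     # kernel b(y, y')
f0 = np.exp(-y**2 / (2 * 0.1**2))                            # unimodal initial density
f = simulate_logistic_trait_model(a, b, f0, dy, dt=0.01, n_steps=5000)
```

With constant rates $a \equiv 1$ and $b \equiv 1$ the total mass obeys the scalar logistic equation $M' = (1 - M)M$, which gives a simple sanity check of the discretization.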

A different model is provided by Méléard and Tran [36], where an age structured
population is studied (this paper is an extension of earlier works by Méléard and
co-authors, see the reference list of [36]). In their model the population is described
by a random measure,

\[ Z_t(dx, da) = \sum_{i=1}^{\langle Z_t, 1\rangle} \delta_{(x_i(t),\, a_i(t))} . \]

The size of the population is $\langle Z_t, 1\rangle$, and each individual is characterized by its
trait value $x \in \mathcal X$ and its age $a \in \mathbb R_+$. Each individual produces offspring with rate
$b(x, a)$ depending on the trait $x$ and age $a$, and the offspring is born with age $a = 0$
and a trait $x = x_{par} + h$, i.e. the parent's trait plus a mutation difference $h$ which
is random and distributed with law $k(x, h)\,dh$. Just like in [19] the death rate has
a component due to competition between all individuals in the population:

\[ d_{tot} = d(x, a) + \int_{\mathbb R_+ \times \mathcal X} U((x,a),(y,a'))\, Z(dy, da') . \]

In simulations, using birth and death rates $b(x, a) = x(4 - x)e^{-a}$ and

\[ d_{tot}(x,a) = \int_{\mathbb R_+ \times \mathcal X} C \Big( 1 - \frac{1}{1 + \nu \exp(-k(x - y))} \Big)\, Z(dy, da) , \]

they find that an initially monomorphic population may evolve into a population
with a bi-modal trait distribution. However, one of the main objectives of their
paper is to study the "large population - rare mutation" scaling. Setting

\[ Z^n_t = \frac{1}{n} \sum_{i=1}^{n \langle Z^n_t, 1\rangle} \delta_{(x_i(t),\, a_i(t))} , \]

they prove that $Z^n_t \to \xi_t \in \mathcal M_F$, where $\mathcal M_F$ is the set of finite measures on $\mathbb R_+ \times \mathcal X$.
Actually, $\xi \in C(\mathbb R_+, \mathcal M_F)$ and, for all $t > 0$ and all sufficiently regular test functions $f = f(x,a)$,

\[ \langle \xi_t, f\rangle = \langle \xi_0, f\rangle + \int_0^t \int_{\mathbb R_+ \times \mathcal X} \big[ \partial_a f(x,a) + f(x,0)\, b(x,a) - f(x,a) \big( d(x,a) + \xi_s U(x,a) \big) \big]\, \xi_s(dx, da)\, ds . \]

A model that in many ways is similar to the one that is presented in the next
section can be found in a paper by Dieckmann and Doebeli [20]. Actually they
discuss two different models, of which one is an individual based simulation model,
the other a model in the framework of adaptive dynamics. The resident population,
having phenotype $x$, is assumed to satisfy the following logistic equation:

\[ \frac{dN(x,t)}{dt} = r N(x,t) \Big( 1 - \frac{N(x,t)}{K(x)} \Big) , \]

where $N(x, t)$ is the size of the population at time $t$, and $K(x)$ is the carrying
capacity for a monomorph population with trait $x$. In [20] $K(x)$ is a Gaussian with
mean $x_0$ and variance $\sigma_K^2$. Due to competition with the resident population, a rare
mutant with trait $y$ will grow with rate

\[ r \Big( 1 - \frac{C(x-y)\, K(x)}{K(y)} \Big) ; \]

here $C(x-y)$, which describes the strength of the competition between a phenotype $x$
and a phenotype $y$, is a Gaussian with variance $\sigma_C^2$. The mathematical analysis in [20] shows that
evolutionary branching can take place only if $\sigma_C < \sigma_K$.

Footnote (to the mutation law $k(x,h)\,dh$ in the model of Méléard and Tran): A natural variation of this would be to consider a mutation rate also depending on the parent's ...
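
The branching condition can be checked numerically from the invasion rate $r(1 - C(x-y)K(x)/K(y))$. The sketch below is illustrative only: the function names and parameter values are hypothetical, and the competition kernel is normalized so that $C(0) = 1$ (an assumption, needed for the resident to be stationary, $S_x(x) = 0$).

```python
import numpy as np

def carrying_capacity(x, x0=0.0, sigma_k=1.0, k0=1.0):
    """Gaussian carrying capacity K(x) with mean x0 and variance sigma_k**2."""
    return k0 * np.exp(-(x - x0) ** 2 / (2 * sigma_k ** 2))

def competition(d, sigma_c=0.5):
    """Gaussian competition kernel C(d), normalized so that C(0) = 1 (assumption)."""
    return np.exp(-d ** 2 / (2 * sigma_c ** 2))

def invasion_fitness(y, x, r=1.0, x0=0.0, sigma_k=1.0, sigma_c=0.5):
    """Growth rate S_x(y) of a rare mutant y in a resident population x."""
    ratio = carrying_capacity(x, x0, sigma_k) / carrying_capacity(y, x0, sigma_k)
    return r * (1.0 - competition(x - y, sigma_c) * ratio)

# At the singular point x = x0, nearby mutants can invade only if sigma_c < sigma_k.
s_branching = invasion_fitness(0.1, 0.0, sigma_k=1.0, sigma_c=0.5)   # positive
s_stable = invasion_fitness(0.1, 0.0, sigma_k=0.5, sigma_c=1.0)      # negative
```

At $x = x_0$ the rate reduces to $r\big(1 - \exp[(y-x_0)^2(1/2\sigma_K^2 - 1/2\sigma_C^2)]\big)$, so its sign for $y \neq x_0$ is decided by the comparison of $\sigma_C$ and $\sigma_K$.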

2.3. A model of sympatric speciation through reinforcement. As explained 
in the introduction, even if evolution brings the resident population to a state of 
disruptive selection, it is often more natural for the population to evolve in one
direction rather than to split into two viable sub-populations evolving in different
directions. If the latter is to happen there must be some mechanism to favor
the smaller of the sub-populations. Competition for resources, and specialization
to particular parts of the available resources may be one such mechanism. For
the sub-populations to evolve into two different species, some kind of reproductive 
isolation is needed to prevent the formation of hybrids. In [25] we have developed 
a model to study some aspects of this. 

We recall that sympatric speciation means that a population develops into two 
species sharing the same geographical habitat (but usually not sharing the same
resources). By reinforcement we mean a process by which natural selection strengthens
the separation of the sub-populations. In this model, reinforcement is implemented
via a characteristic trait $y$ describing the appearance of an individual (e.g.
the color of the tail feathers, the pitch of the song ...), and another trait $y^*$ describing
what characteristic traits in a potential mating partner the individual is
attracted to. We assume that the traits $y$ and $y^*$ are only related to the choice of
partner, and not directly to the fitness. Fitness, on the other hand, is determined 
by a trait x, which is related to the distribution of food resources in the local en- 
vironment. We show, by simulation, that in this model reinforcement is needed for 
speciation to take place, and that the expected time before the speciation event is 
shorter if the characteristic trait y has more than one dimension. 

Here is the model: 

The population lives in an environment where food (or other essential resources)
is characterized by a parameter $x \in \mathcal X$, and the food is distributed in space
according to a density $f(x)$. The trait of an individual in the population has several
parts:

• $x \in \mathcal X$ is related to the fitness, its competitivity in collecting the essential
resource;

• $y \in \mathcal Y$ is a recognizable, characteristic trait, and $y^* \in \mathcal Y$ is the individual's
preference of trait value in potential mating partners. These two traits
combined yield the probability that a given couple of individuals will mate.

The size of the population is denoted $N$, and letting $z_k = (x_k, y_k, y_k^*) \in \mathcal Z =
\mathcal X \times \mathcal Y \times \mathcal Y$ be the phenotype of individual $k$, the whole phenotype distribution in
the population is

\[ z^N = (z_1, \dots, z_N) \in \mathcal Z^N . \]

The dynamics is time-discrete, and we assume that only the offspring survives from 
one generation to the next. The process can be described as follows: 

(1) Each individual collects food according to its relative fitness, 

(2) It then chooses a mating partner, at random but with a high probability to 
select a partner with a characteristic trait corresponding to the preference. 

(3) The size of the offspring is Poisson distributed with a parameter proportional
to the couple's joint access to the food resource.

(4) The phenotype of the offspring is the average of that of the parents, but
mutations are included by adding a (Gaussian) random variable.

Figure 1. A pictorial description of the replicator dynamics of
the population (offspring depending on access to resource).

The procedure is described in Figure 1. More precisely:

(1) An individual $k$ has access to a fraction $c_k$ of the available resource:

\[ c_k = \int_{\mathcal X} \frac{e^{-(x_k - x)^2/2\sigma_x^2}}{\sum_{i=1}^{N} e^{-(x_i - x)^2/2\sigma_x^2}}\, \frac{df(x)}{\|f\|} . \]

This represents the competition among the individuals.
(2) Each individual $k$ is given the opportunity to choose a mating partner, and
chooses $j$ with probability

\[ \mathrm{Prob}(j_k = j) = \begin{cases} \dfrac{e^{-|y_k^* - y_j|^2/2\sigma^2}}{\sum_{i \neq k} e^{-|y_k^* - y_i|^2/2\sigma^2}} & (j \neq k) , \\[2ex] 0 & (k = j) . \end{cases} \]

This is the reinforcement in the model, because it helps forming sub-populations
such that mating takes place within the group. The parameter
$\sigma$, which we have taken to be the same for all individuals in the population,
determines the choosiness in selection of partners for mating.
(3) The couple $(k, j_k)$ then produces a Poisson distributed number $n_k$ of children,
with rate $\frac{c_k + c_{j_k}}{2} \|f\|$, i.e. proportional to the amount of the resource
that has been collected by the couple. This means that the size of the population
at time $t + 1$ will be a Poisson distributed variable, $N_{t+1} = \sum_{k=1}^{N_t} n_k$,
with a random parameter

\[ \Lambda = \|f\| \sum_{k=1}^{N_t} \frac{c_k + c_{j_k}}{2} . \]

It is random because of the random choice of partner, $j_k$, and the law
depends on the whole population at time $t$.
(4) Each child has a trait $z = z_{k,i}$, $i = 1, \dots, n_k$,

\[ z_{k,i} = \frac{z_k + z_{j_k}}{2} + (\xi_i, \eta_i, \eta_i^*) , \]

where $(\xi_i, \eta_i, \eta_i^*)$ are Gaussian random variables.
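
Steps (1) to (4) translate almost line by line into a simulation of one generation. The following is a sketch under simplifying assumptions that are not taken from [28]: all three traits are scalar, the food measure is a finite sum of point masses (`food_w[m]` at `food_x[m]`), and the same mutation standard deviation `mut_std` is used for $x$, $y$ and $y^*$. The function and parameter names are hypothetical.

```python
import numpy as np

def one_generation(x, y, ystar, food_x, food_w, sigma_x, sigma, mut_std, rng):
    """Steps (1)-(4) for scalar traits; returns offspring traits and the shares c."""
    n = len(x)
    # (1) share c_k of the resource, competing at each food point
    kern = np.exp(-(x[:, None] - food_x[None, :]) ** 2 / (2 * sigma_x ** 2))
    c = (kern / kern.sum(axis=0)) @ (food_w / food_w.sum())
    # (2) individual k picks a partner j != k with weight exp(-|y*_k - y_j|^2 / 2 sigma^2)
    pref = np.exp(-(ystar[:, None] - y[None, :]) ** 2 / (2 * sigma ** 2))
    np.fill_diagonal(pref, 0.0)
    prob = pref / pref.sum(axis=1, keepdims=True)
    partner = np.array([rng.choice(n, p=prob[k]) for k in range(n)])
    # (3) Poisson number of children with rate (c_k + c_{j_k}) / 2 * ||f||
    n_children = rng.poisson(0.5 * (c + c[partner]) * food_w.sum())
    # (4) each child is the average of its parents plus a Gaussian mutation
    parent_a = np.repeat(np.arange(n), n_children)
    parent_b = partner[parent_a]

    def child(trait):
        noise = mut_std * rng.standard_normal(parent_a.size)
        return 0.5 * (trait[parent_a] + trait[parent_b]) + noise

    return child(x), child(y), child(ystar), c

# Hypothetical setup: monomorphic population, food concentrated at x = -1 and x = 1.
rng = np.random.default_rng(0)
n = 50
x, y, ystar = np.zeros(n), np.zeros(n), np.zeros(n)
food_x = np.array([-1.0, 1.0])
food_w = np.array([50.0, 50.0])      # ||f|| = 100 sets the expected population size
x1, y1, ystar1, c = one_generation(x, y, ystar, food_x, food_w,
                                   sigma_x=0.5, sigma=0.2, mut_std=0.05, rng=rng)
```

Iterating `one_generation` on its own output gives a trajectory of the population; the sketch does not guard against extinction or against numerical underflow when a preference $y^*_k$ is far from every available $y_j$.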
Some simulation results are shown in Figures 2, 3 and 4. For the simulations
in Figure 2 the food resource is concentrated at two points, $x = \pm 1$, and the
population is initially monomorph with phenotype $x = 0$. Without reinforcement,
as in (a), the population remains concentrated around $x = 0$, although the small
mutations are seen as noise in the distribution. With reinforcement, as in (b),
(c), and (d), the population immediately splits in two sub-populations, each one
exploiting one of the food resources. The graph in (d) shows the distribution of
the appearance trait $y$. It does also separate in two parts, but there is no reason
for the parts to stay at any particular position, and therefore these will carry out a
random walk in the $\mathcal Y$-space. Eventually the two branches could meet, which would
lead to the appearance of hybrid phenotypes. The graph in (c) shows the evolution
of the "food distribution entropy",

\[ W_c(t) = \sum_{k=1}^{N_t} c_k \log(N_t c_k) , \]

which is zero if and only if $c_k = 1/N_t$ for all $k$. The simulations show that the
population approaches a situation where all individuals attract the same quantity 
of the food resource. 
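
For reference, the entropy $W_c$ is a one-line computation once the shares $c_k$ are known. The helper below is a hypothetical sketch; it uses the usual convention $0 \log 0 = 0$, which is an addition not stated in the text.

```python
import numpy as np

def food_entropy(c):
    """W_c = sum_k c_k * log(N * c_k) for resource shares c_k that sum to one."""
    c = np.asarray(c, dtype=float)
    n = c.size
    positive = c > 0                      # convention: 0 * log(0) = 0
    return float(np.sum(c[positive] * np.log(n * c[positive])))

w_equal = food_entropy(np.full(4, 0.25))        # all individuals get the same share
w_unequal = food_entropy([0.7, 0.2, 0.1])       # unequal shares
```

Since $W_c$ is the relative entropy of $(c_k)$ with respect to the uniform distribution, it is non-negative whenever the shares sum to one.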

Figure 3 shows the results of a simulation where the food resources are equally
distributed at the points x = —1,0,1, and as expected, with reinforcement, the 
population will then split in three sub-populations, but it may happen in different 
ways: The figures (a), (b), and (c), (d) show the results of two simulations with 
exactly the same initial conditions. 

And then Figure 4 shows the result of a simulation in which the food distribution is Gaussian
in $x$, with mean zero, and the $x$-phenotype is initially concentrated at $x = 2$. We
can see in (a) how the whole population evolves to $x$-values close to zero, before
the splitting into sub-populations takes place. The graph in (c) shows the food
distribution entropy, and the graph in (d) the size of the population, $N_t$.

The model can be reformulated into a more mathematically tractable form by
identifying the population $Z_t$ with a point measure in $\mathcal Z$:

\[ Z_t \leftrightarrow \sum_{j=1}^{\langle Z_t, 1\rangle} \delta_{z_j} \in \mathcal M_P(\mathcal Z) \qquad (N_t = \langle Z_t, 1\rangle) . \]

To find an expression for $Z_{t+1}$, we write the offspring from an individual $z_j$ as

\[ \Gamma(\cdot, z_j) = \sum_{i=1}^{\langle \Gamma(\cdot, z_j), 1\rangle} \delta_{z_i} \qquad (\text{offspring from } z_j) . \]

Then the next generation is

\[ Z_{t+1} = \sum_{j=1}^{N_t} \Gamma(\cdot, z_j) = \int_{\mathcal Z} \Gamma(\cdot, z)\, Z_t(dz) . \]

Figure 2. (a) and (b): $x$-trait without and with reinforcement,
(c): $c$-entropy, $W_c$, (d): appearance trait $y$.



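Here $\Gamma(\cdot, z)$ is a random point measure in $\mathcal Z$, whose law can be computed from
the description above. The details are given in [28]. From this we wish to write a
master equation for the process, to find a formula for

\[ E\Big[ \int_{\mathcal Z} \phi(z)\, Z_{t+1}(dz) \,\Big|\, Z_t \Big] = \int_{\mathcal Z} E\Big[ \int_{\mathcal Z} \phi(z)\, \Gamma(dz, z') \,\Big|\, Z_t \Big]\, Z_t(dz') . \]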
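To continue, we first write $\Gamma$ by conditioning on the mating partner,

\[ E\Big[ \int_{\mathcal Z} \phi(z)\, \Gamma(dz, z') \,\Big|\, Z_t \Big] = \int_{\mathcal Z} E\Big[ \int_{\mathcal Z} \phi(z)\, \Gamma(dz, z') \,\Big|\, Z_t, z'' \Big]\, P(z', z'')\, Z_t(dz'') , \tag{1} \]

where the choice of mating partner, $\mathrm{Prob}(j_k = j)$, is encoded in

\[ P(z', z'') = \frac{e^{-|y'^* - y''|^2/2\sigma^2}\, 1_{z' \neq z''}}{\int_{\mathcal Z} e^{-|y'^* - y|^2/2\sigma^2}\, 1_{z' \neq z}\, Z_t(dz)} . \]

The size of the offspring then depends on the resource distribution in the population,

\[ c(z) = \int_{\mathcal X} \frac{e^{-(x - x')^2/2\sigma_x^2}}{\int_{\mathcal Z} e^{-(x'' - x')^2/2\sigma_x^2}\, Z_t(dz'')}\, \frac{f(dx')}{\|f\|} . \]

Figure 3. Two simulations with identical parameters.
(a) and (c): trait $x$ in the population, and (b) and (d): trait $y$.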



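The expectation value in the integral in the right-hand side of equation (1) can
then be computed as

\[ E\Big[ \int_{\mathcal Z} \phi(z)\, \Gamma(dz, z') \,\Big|\, Z_t, z'' \Big] = \sum_{k=0}^{\infty} \mathrm{Prob}\big[ \langle \Gamma(dz, z'), 1\rangle = k \,\big|\, Z_t, z'' \big] \sum_{i=1}^{k} \int_{\mathcal Z} \phi\Big( \frac{z' + z''}{2} + \zeta \Big) M(\zeta)\, d\zeta . \]

The last sum gives the contribution from each child of parents with phenotype $z'$
and $z''$; the phenotype of the child is the average of the parents' phenotypes plus a
random mutation $\zeta$ which is distributed with law $M(\zeta)\,d\zeta$. The number of offspring
$k$ is Poisson distributed:

\[ \mathrm{Prob}\big[ \langle \Gamma(dz, z'), 1\rangle = k \,\big|\, Z_t, z'' \big] = e^{-\lambda(z', z'')}\, \frac{\lambda(z', z'')^k}{k!} \]

with

\[ \lambda(z', z'') = \frac{c(z') + c(z'')}{2} \int_{\mathcal X} f(dx) . \]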

In this form, the equations are still not very explicit, but one may formally take
the limit of infinitely many individuals as in the paper by Méléard and Tran, as
described above: we let $Z^n_t$ denote the rescaled population measure, and we assume that this has
a limit as $n \to \infty$, and even that the limit is given by a density: $Z^n_t \to u_t(z)$ with
$u_t = u_t(z) = u_t(x, y, y^*) \in L^1(\mathcal X \times \mathcal Y \times \mathcal Y^*) = L^1(\mathcal Z)$, $u \geq 0$, $t \in \mathbb N$.

Figure 4. 100 individuals, Gaussian food distribution

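It is then possible to identify the limiting expressions for the food distribution,
the probability of choosing a particular mating partner, etc.:

\[
\begin{aligned}
c_k &\mapsto c(z; u) = \int_{\mathcal X} \frac{e^{-(x - x')^2/2\sigma_x^2}}{\int_{\mathcal Z} e^{-(x'' - x')^2/2\sigma_x^2}\, u(z'')\, dz''}\, \frac{df(x')}{\|f\|} , \\
\mathrm{Prob}(j_k = j) &\mapsto P_z(z'; u) = \frac{e^{-|y^* - y'|^2/2\sigma^2}}{\int_{\mathcal Z} e^{-|y^* - y''|^2/2\sigma^2}\, u(z'')\, dz''} , \\
\lambda_k &\mapsto \lambda(z, z'; u) = \frac{c(z; u) + c(z'; u)}{2} \int_{\mathcal X} df(x) .
\end{aligned}
\]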

Finally we may write the master equation for the limiting densities $u_t$:

\[
\begin{aligned}
\int_{\mathcal Z} \phi(z)\, u_{t+1}(z)\, dz
&= \int_{\mathcal Z} \int_{\mathcal Z} \int_{\mathcal Z} \phi\Big( \frac{z' + z''}{2} + z \Big)\, u_t(z')\, u_t(z'')\, \lambda(z', z''; u)\, P_{z'}(z''; u)\, M(z)\, dz\, dz'\, dz'' \\
&= \int_{\mathcal Z} \Big[ \int_{\mathcal Z} \int_{\mathcal Z} u_t(z')\, u_t(z'')\, \lambda(z', z''; u)\, P_{z'}(z''; u)\, M\Big( z - \frac{z' + z''}{2} \Big)\, dz'\, dz'' \Big]\, \phi(z)\, dz .
\end{aligned}
\tag{2}
\]

In the change of variables used to obtain the last line, it is assumed that $\mathcal Z = \mathbb R^d$.
To conclude, $u_{t+1}$ can be expressed in terms of $u_t$ using the expression in brackets
in the last member of equation (2). However, I want to stress that this is only a
very formal derivation, and also that there are no mathematical results concerning
e.g. the long time behavior of the model.

2.4. An averaging process. If we neglect the rather complicated process of choos- 
ing the mating partner, the way in which the food resource is distributed, and how 
this affects the number of offspring of a given couple, the process is a simple one: 
according to some probability distribution, choose a random couple of individuals, 
and replace this couple by an offspring whose phenotype is the average of the par- 
ents' but randomly displaced due to mutations. An extremely simplified version of 
this is the following: 

Consider $N$ individuals with a scalar phenotype $x \in \mathbb R$. The population is
therefore described by $(x_1, \dots, x_N)$. The phenotype distribution is then updated as
follows:

• choose a couple $(x_j, x_k)$ uniformly at random

• replace those two individuals with a new couple, as

\[ (x_j, x_k) \mapsto \Big( \frac{x_j + x_k}{2} + X_1 ,\; \frac{x_j + x_k}{2} + X_2 \Big) , \]

where $X_1$ and $X_2$ are i.i.d. with probability density $g(x) \in L^1(\mathbb R)$.
We now assume that in some limit all the $x_j$ are distributed with a law $f(x)\,dx$.
If $x_0$ is drawn from this distribution, it should be the result of a replacement, i.e.

\[ x_0 = \frac{x_{1,1} + x_{1,2}}{2} + X_0 , \]

where also $x_{1,1}$ and $x_{1,2}$ are distributed with law $f(x)$, and where $X_0$ is a random
variable with distribution $g(x)$. The same argument can be repeated for $x_{1,1}$ and
$x_{1,2}$, which then gives (the notation should be clear)

\[
\begin{aligned}
x_0 &= \frac{1}{2} \Big( \frac{x_{2,1} + x_{2,2}}{2} + X_{1,1} + \frac{x_{2,3} + x_{2,4}}{2} + X_{1,2} \Big) + X_0 \\
&= \frac{1}{4} \big( x_{2,1} + x_{2,2} + x_{2,3} + x_{2,4} \big) + \frac{1}{2} \big( X_{1,1} + X_{1,2} \big) + X_0 .
\end{aligned}
\]

The procedure can be repeated, and after $n$ iterations we have

\[ x_0 = \frac{1}{2^n} \sum_{j=1}^{2^n} x_{n,j} + \frac{1}{2^{n-1}} \sum_{j=1}^{2^{n-1}} X_{n-1,j} + \cdots + \frac{1}{4} \sum_{j=1}^{4} X_{2,j} + \frac{1}{2} \big( X_{1,1} + X_{1,2} \big) + X_0 . \]

By the law of large numbers, the first term converges to $\int_{\mathbb R} x f(x)\, dx$ when $n \to \infty$,
and the other terms can be expressed exactly in terms of the distribution $g$: The
law of $\frac{1}{2^{n-1}} \sum_{j=1}^{2^{n-1}} X_{n-1,j}$ is $2^{n-1} g^{*2^{n-1}}(2^{n-1}\, \cdot)$, for example. This gives a relation
between the densities $f$ and $g$, which is most easily expressed in terms of their
Fourier transforms:

\[ \hat f(\xi) = \hat f(2^{-n} \xi)^{2^n} \Big( \prod_{k=1}^{n-1} \hat g(2^{-k} \xi)^{2^k} \Big)\, \hat g(\xi) . \]

(Footnote on the limit assumed for the law $f(x)\,dx$: for a given $N$, the distribution should be updated until a stationary state has been achieved,
and then let $N \to \infty$.)

If $f$ has mean zero and bounded second moments, the first factor converges to 1 when $n \to \infty$
(this is obviously sufficient to guarantee that the law of large numbers holds), and
we have then an explicit expression of $f$ in terms of the distribution $g$. One
example where this can be computed explicitly is when $g$ is a Gaussian function
with variance $a$; then $f$ will also be Gaussian, but with variance $2a$.
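
The value $2a$ can be checked by simulating the averaging process directly. The sketch below is illustrative: the function name, the population size and the number of steps are arbitrary choices. For finite $N$ the sum of all phenotypes changes by $X_1 + X_2$ in each replacement, so the centre of mass performs a random walk; the spread is therefore measured about the empirical mean.

```python
import numpy as np

def averaging_process(n_particles, n_steps, a, rng):
    """Run the pair-averaging process with Gaussian displacements of variance a."""
    x = np.zeros(n_particles)                       # monomorphic initial population
    j = rng.integers(0, n_particles, size=n_steps)
    k = rng.integers(0, n_particles - 1, size=n_steps)
    k = np.where(k >= j, k + 1, k)                  # uniform over indices different from j
    noise = np.sqrt(a) * rng.standard_normal((n_steps, 2))
    for s in range(n_steps):
        mean = 0.5 * (x[j[s]] + x[k[s]])            # average of the chosen couple
        x[j[s]] = mean + noise[s, 0]
        x[k[s]] = mean + noise[s, 1]
    return x

rng = np.random.default_rng(1)
a = 1.0
x = averaging_process(n_particles=2000, n_steps=40000, a=a, rng=rng)
empirical_variance = x.var()                        # variance about the empirical mean
```

With these (hypothetical) parameters each particle is replaced about 40 times on average, which is ample for the empirical variance to settle near $2a$, up to sampling fluctuations and a correction of order $1/N$.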

We end this section by writing a master equation for the process, and computing
expressions for the evolution of marginals. While this doesn't have much to do
with the model for speciation, it gives an introduction to much of what will follow.
For a system of $N$ particles, the configuration space is $\mathbb R^N$ (there is no hardcore
condition or similar restriction to the particle configuration). Let $f_N(x_1, \dots, x_N)$
be a density in $\mathbb R^N$. One replacement according to the description above, replacing a
randomly chosen pair of particles by new particles whose position is the average of
the parents' position plus independent displacements, transforms the density as

\[
f_N(x_1, \dots, x_N) \mapsto f_N'(x_1, \dots, x_N) = \frac{2}{N(N-1)} \sum_{j<k} \int_{\mathbb R^2} f_N(x_1, \dots, x_j', \dots, x_k', \dots, x_N)\,
g\Big( x_j - \frac{x_j' + x_k'}{2} \Big)\, g\Big( x_k - \frac{x_j' + x_k'}{2} \Big)\, dx_j'\, dx_k' .
\tag{3}
\]

As is commonly done, we assume that the densities are symmetric with respect to
permutation of the variables, and it is easy to see that if this holds for $f_N$, then it
also holds for $f_N'$: symmetry is preserved by the dynamics. The $k$-particle marginals
are defined as

\[ f_{N,k}(x_1, \dots, x_k) = \int_{\mathbb R^{N-k}} f_N(x_1, \dots, x_k, x_{k+1}, \dots, x_N)\, dx_{k+1} \cdots dx_N . \]

Because of the permutation symmetry, the same result would be obtained by leaving
any set of $k$ variables untouched, integrating over the remaining $N - k$ variables.
Integrating both sides of equation (3) over $x_{k+1}, \dots, x_N$, we find an expression for
how the $k$-particle marginals are transformed by the replacement process. For $k = 1$
and $k = 2$ the result is

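\[
f_{N,1}'(x_1) = \Big( 1 - \frac{1}{N} \Big) \Big( 1 - \frac{1}{N-1} \Big) f_{N,1}(x_1)
+ \frac{2}{N} \int_{\mathbb R^2} f_{N,2}(x_1', x_2')\, g\Big( x_1 - \frac{x_1' + x_2'}{2} \Big)\, dx_1'\, dx_2'
\]

and

\[
\begin{aligned}
f_{N,2}'(x_1, x_2) ={}& \Big( 1 - \frac{2}{N} \Big) \Big( 1 - \frac{2}{N-1} \Big) f_{N,2}(x_1, x_2) \\
&+ \frac{2(N-2)}{N(N-1)} \int_{\mathbb R^2} f_{N,3}(x_1, x_2', x_3')\, g\Big( x_2 - \frac{x_2' + x_3'}{2} \Big)\, dx_2'\, dx_3' \\
&+ \frac{2(N-2)}{N(N-1)} \int_{\mathbb R^2} f_{N,3}(x_1', x_2, x_3')\, g\Big( x_1 - \frac{x_1' + x_3'}{2} \Big)\, dx_1'\, dx_3' \\
&+ \frac{2}{N(N-1)} \int_{\mathbb R^2} f_{N,2}(x_1', x_2')\, g\Big( x_1 - \frac{x_1' + x_2'}{2} \Big)\, g\Big( x_2 - \frac{x_1' + x_2'}{2} \Big)\, dx_1'\, dx_2' ,
\end{aligned}
\]

respectively. Here, and in all higher order terms, we find that the expression for $f_k'$
involves terms with $f_{k+1}$, so if $k < N$, the system will never be closed.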
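
The coefficients in the equation for the two-particle marginal are the probabilities that a uniformly chosen couple contains neither, exactly one, or both of the particles 1 and 2. As a small sanity check, here is a sketch using exact rational arithmetic (the helper name is hypothetical) that computes them by counting couples:

```python
from fractions import Fraction
from math import comb

def pair_weights(n):
    """Probabilities that a uniform couple among n particles contains
    neither of particles 1 and 2, particle 2 but not particle 1, or both."""
    total = comb(n, 2)                          # number of unordered couples
    neither = Fraction(comb(n - 2, 2), total)   # both members among the other n - 2
    only_one = Fraction(n - 2, total)           # particle 2 with one of the other n - 2
    both = Fraction(1, total)                   # the couple (1, 2) itself
    return neither, only_one, both

n = 10
neither, only_one, both = pair_weights(n)
```

The case "exactly one of the two" occurs twice (particle 1 or particle 2), which is why the two integrals involving the three-particle marginal carry the same weight.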

The next step is to let $N \to \infty$. For this to make sense, we write $f^j_{N,k}(x_1,\dots,x_k)$
for the distribution obtained after $j$ replacements, and write

\[ \frac{f^{j+1}_{N,1}(x_1) - f^{j}_{N,1}(x_1)}{2/N}
   = \iint f^{j}_{N,2}(x_1',x_2')\, g\Big(x_1-\frac{x_1'+x_2'}{2}\Big)\,dx_1'\,dx_2' - f^{j}_{N,1}(x_1) . \tag{4} \]

Now, if one thinks of $f^j_{N,k}$ as being the values of a time dependent function evaluated
at discrete points, $f^j_{N,k}(x_1,\dots,x_k) = f_{N,k}(x_1,\dots,x_k,\Delta t\, j)$ with $\Delta t = 2/N$, the
left-hand side of equation (4) is a finite difference approximation of
$\frac{\partial}{\partial t} f_{N,k}(x_1,\dots,x_k,\Delta t\, j)$ with $k = 1$.
Passing to the limit as $N \to \infty$ gives

\[ \frac{\partial}{\partial t} f_1(x_1,t) = \iint f_2(x_1',x_2',t)\, g\Big(x_1-\frac{x_1'+x_2'}{2}\Big)\,dx_1'\,dx_2' - f_1(x_1,t) . \]

If in addition we assume that propagation of chaos holds, that is, $f_2(x_1',x_2',t) =
f_1(x_1',t) f_1(x_2',t)$ (this will be discussed at length in the following sections), then a
closed equation for the one-particle marginal is obtained:

\[ \frac{\partial}{\partial t} f_1(x_1,t) = \iint f_1(x_1',t) f_1(x_2',t)\, g\Big(x_1-\frac{x_1'+x_2'}{2}\Big)\,dx_1'\,dx_2' - f_1(x_1,t) . \tag{5} \]

Similar equations can be obtained for all marginals, but if the propagation of chaos
is assumed to hold, then all information is already present in equation (5).
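
To make the limit concrete, here is a minimal simulation sketch of the $N$-particle
replacement process. It assumes, as the marginal equations suggest, that a uniformly chosen
pair is replaced by two new positions, each equal to the pair's mean plus independent noise
with density $g$; taking $g$ Gaussian with standard deviation `sigma` is an illustrative
choice, not something prescribed in the text. Multiplying (5) by $x_1^2$ and integrating
gives, for a centred solution with second moment $m_2$, the relation $\dot m_2 = \sigma^2 - m_2/2$, so the
variance should relax to $2\sigma^2$.

```python
import random
import statistics

def replacement_step(xs, sigma, rng):
    """Replace two randomly chosen particles by their mean plus independent noise."""
    i, j = rng.sample(range(len(xs)), 2)
    mean = 0.5 * (xs[i] + xs[j])
    xs[i] = mean + rng.gauss(0.0, sigma)
    xs[j] = mean + rng.gauss(0.0, sigma)

def simulate(N=2000, sigma=1.0, t_final=30.0, seed=1):
    """Run the process up to time t_final; one replacement advances time by 2/N."""
    rng = random.Random(seed)
    xs = [rng.uniform(-5.0, 5.0) for _ in range(N)]
    for _ in range(int(t_final * N / 2)):
        replacement_step(xs, sigma, rng)
    return xs

xs = simulate()
print("empirical variance:       ", statistics.pvariance(xs))
print("stationary value from (5):", 2 * 1.0 ** 2)
```

The agreement is only statistical: for finite $N$ the empirical variance fluctuates around the
predicted value, with fluctuations that shrink as $N$ grows.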



3. Applications from biology: models of flocking animals 

It is fascinating to watch the huge bird flocks flying over big cities, or the schools of
fish that sometimes form close to bridges, and recently there have been many
attempts to make mathematical models to describe the observed phenomena. What
is intriguing is that these complex structures can be formed without any obvious
leader: all individuals in the flock seem to have the same status. How can the
presumably rather simple rules controlling the behavior of individual birds result
in this complex collective behavior?

In the first section I will present a couple of well known mathematical models
related to swarming animals (without any claim to give a comprehensive list), and
then discuss in some more detail a model which has been analyzed in [5].

3.1. The Boids and Cucker-Smale models. The boids model [40] is a system
of ODEs describing the evolution of $N$ particles:

\[ \ddot r_i = \sum_j \Big[ f(r_{ij})\, r_{ij} + a_1 (v_j - v_i)\,\mathbf{1}_{\{r_{ij} < r_c\}} \Big] - a_2 (r_i - \bar r) + \dots \]

Here $r_j$ is the position of "boid" $j$. The different terms describe the boid's desire to
move towards the average position of the swarm, to approach the average velocity of
the boids within a smaller neighborhood, and to avoid crowding.

There are other models that are discrete in time, and the Cucker-Smale model [14]
is a particular example for which much progress on the mathematical theory has
been made recently [27]. Here the velocities $v_i(n)$, $i = 1,\dots,N$, $n = 1, 2, 3, \dots$ evolve
according to

\[ v_i(n+1) - v_i(n) = \frac{\lambda}{N}\sum_{j=1}^{N} a_{ij}\,\big(v_j(n) - v_i(n)\big), \qquad
   a_{ij} = \frac{1}{\big(1 + \|x_i - x_j\|^2\big)^{\gamma}} . \]

The main mechanism here is alignment: the particles strive to align with the
surrounding particles, and the strength of interaction depends on the distance between
the particles through the function $a_{ij}$.
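
The alignment mechanism can be illustrated by a few lines of code. The sketch below is
written for scalar positions and velocities and holds the positions fixed; the coupling
constant `lam`, the exponent `gamma` and the prefactor `lam / N` are assumptions made
for the illustration, as is the function name `cucker_smale_step`.

```python
def communication_rate(xi, xj, gamma):
    """a_ij = 1 / (1 + |x_i - x_j|^2)^gamma for scalar positions."""
    return 1.0 / (1.0 + (xi - xj) ** 2) ** gamma

def cucker_smale_step(xs, vs, lam=1.0, gamma=0.5):
    """One discrete-time alignment update of all velocities (positions held fixed)."""
    n = len(vs)
    new_vs = []
    for i in range(n):
        pull = sum(communication_rate(xs[i], xs[j], gamma) * (vs[j] - vs[i])
                   for j in range(n))
        new_vs.append(vs[i] + lam / n * pull)
    return new_vs

xs = [0.0, 1.0, 3.0, 6.0]
vs = [1.0, -2.0, 0.5, 4.0]
for _ in range(50):
    vs = cucker_smale_step(xs, vs)
print(vs)  # all entries end up close to the initial mean velocity
```

Because $a_{ij} = a_{ji}$, the mean velocity is unchanged by the update, and for `lam` at most one
each new velocity is a convex combination of the old ones, so the velocity spread cannot grow.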

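A Boltzmann equation inspired by the model of Cucker and Smale has been
derived by Carrillo et al. [12]. They consider a density of individuals, $f(x,v,t)$,
interacting pairwise by exchanging velocities according to

\[ v^* = (1 - \gamma a(x-y))\,v + \gamma a(x-y)\,w , \qquad
   w^* = \gamma a(x-y)\,v + (1 - \gamma a(x-y))\,w , \]

and this leads to a Boltzmann equation

\[ \frac{\partial}{\partial t} f(x,v,t) + v\cdot\nabla_x f(x,v,t) = Q(f,f)(x,v,t) , \]

where the collision operator (the binary interactions) is

\[ Q(f,f)(x,v) = \iint \Big( \frac{1}{J}\, f(x,v_*) f(y,w_*) - f(x,v) f(y,w) \Big)\,dw\,dy . \]

Here $(v_*,w_*)$ are the pre-collisional velocities that are mapped to $(v,w)$, and $J$ is the
Jacobian of the transformation $(v,w) \mapsto (v^*,w^*)$.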
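
Two elementary properties of the binary exchange rule are worth checking numerically:
the sum $v + w$ is conserved, and the relative velocity is contracted by the factor
$1 - 2\gamma a(x-y)$. In the sketch below the single parameter `strength` stands for the
product $\gamma a(x-y)$; the function name is hypothetical.

```python
def exchange(v, w, strength):
    """Post-interaction velocities; `strength` stands for gamma * a(x - y)."""
    v_new = (1.0 - strength) * v + strength * w
    w_new = strength * v + (1.0 - strength) * w
    return v_new, w_new

v, w = 2.0, -1.0
for s in (0.0, 0.1, 0.25, 0.5):
    v_new, w_new = exchange(v, w, s)
    # second column: conserved sum; third column: contraction factor 1 - 2 s
    print(s, v_new + w_new, (v_new - w_new) / (v - w))
```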



3.2. The Vicsek model and a related Boltzmann equation. A discrete time
model somewhat similar to that of Cucker and Smale was derived by Vicsek [48, 15],
and has since been used in a large number of publications. In this model, all
velocities have the same magnitude, $v_0$, and the direction is updated in one of the
following ways [13]:

\[ v_i(t+\Delta t) = v_0\,\vartheta\Big[ \sum_{j\in S_i} v_j(t) + \eta N_i\, \xi \Big] , \tag{6} \]

\[ v_i(t+\Delta t) = v_0\,(\mathcal{R}_\eta \circ \vartheta)\Big[ \sum_{j\in S_i} v_j(t) \Big] . \tag{7} \]

Figure 5. The velocity jumps in the model of Bertin et al.

In both these models, $S_i$ is the set of neighbors of particle $i$, defined as all
other particles inside a ball of given radius around $x_i$, the position of particle $i$.
This means that particle $i$ only interacts with other particles inside this ball. The
function $\vartheta[\cdot]$ normalizes a vector: $\vartheta(v) = v/|v|$. In (6) the new velocity of particle $i$
is computed by first taking the average velocity of the particles inside the radius of
interaction, adding a random vector $\eta\,\xi$ scaled by the number $N_i$ of particles inside the
ball of interaction, and finally normalizing the magnitude. In (7) the new velocity
is computed by first finding the average velocity of the particles in the ball (as in
(6)), normalizing and finally carrying out a random rotation $\mathcal{R}_\eta$. The two models
have been analyzed carefully with respect to e.g. phase transitions in [13].
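
The second update rule, averaging, normalizing and then rotating at random, can be sketched
as follows for velocities in the plane. The choice of a rotation angle uniform in
$[-\eta\pi, \eta\pi]$ is an assumption made for the illustration, and the function names are
hypothetical; the caller decides which velocities belong to the neighborhood $S_i$.

```python
import math
import random

def normalize(v):
    """The normalizing map: v -> v / |v| (undefined if the input is the zero vector)."""
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def rotate(v, angle):
    """Rotate a planar vector by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def vicsek_angular_update(neighbour_velocities, v0, eta, rng):
    """Sum the neighbours' velocities, normalize, rotate at random, rescale to speed v0."""
    sx = sum(v[0] for v in neighbour_velocities)
    sy = sum(v[1] for v in neighbour_velocities)
    direction = normalize((sx, sy))
    angle = rng.uniform(-eta * math.pi, eta * math.pi)  # assumed noise law
    rotated = rotate(direction, angle)
    return (v0 * rotated[0], v0 * rotated[1])

rng = random.Random(0)
new_v = vicsek_angular_update([(1.0, 0.0), (0.0, 1.0), (0.6, 0.8)], 2.0, 0.1, rng)
print(new_v, math.hypot(*new_v))  # the speed is v0 whatever the noise
```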
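A Boltzmann equation related to the Vicsek model has been derived in [5]:

\[ \begin{aligned}
\frac{\partial f}{\partial t}(r,\theta,t) + e(\theta)\cdot\nabla f(r,\theta,t)
 ={}& -\lambda f(r,\theta,t) + \lambda \int_{-\pi}^{\pi} d\theta' \int_{-\infty}^{\infty} d\eta\; p_0(\eta)
      \sum_{m=-\infty}^{\infty} \delta(\theta' + \eta - \theta + 2\pi m)\, f(r,\theta',t) \\
 & - f(r,\theta,t) \int_{-\pi}^{\pi} d\theta'\, |e(\theta') - e(\theta)|\, f(r,\theta',t) \\
 & + \int_{-\pi}^{\pi} d\theta_1 \int_{-\pi}^{\pi} d\theta_2 \int_{-\infty}^{\infty} d\eta\; p(\eta)\,
      |e(\theta_2) - e(\theta_1)|\, f(r,\theta_1,t) f(r,\theta_2,t) \\
 & \qquad\times \sum_{m=-\infty}^{\infty} \delta(\bar\theta + \eta - \theta + 2\pi m) .
\end{aligned} \]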

3.3. A simple kinetic equation on the circle. In order to derive fluid equations
by the Hilbert or Chapman-Enskog methods, one needs to know the equilibrium
distributions, and in order to approach this we will now study a simpler, spatially
homogeneous model in the plane [5], which can be derived from a master equation
very similar to the one in Section 2,

\[ \partial_t f(t,\theta) = \int_{-\pi}^{\pi}\int_{-\pi}^{\pi}
   \Big( f(t,\theta') f(t,\theta'+\theta_*)\, g\big(\theta - \theta' - \tfrac{\theta_*}{2}\big) - f(t,\theta) f(t,\theta+\theta_*) \Big)\,
   \beta(|\sin(\theta_*/2)|)\, \frac{d\theta'}{2\pi}\frac{d\theta_*}{2\pi} , \tag{8} \]

which corresponds to a jump process as depicted in the rightmost part of Figure 5:
two particles with velocities $\theta_1$ and $\theta_2$ (so the velocities are represented by angles)
get new velocities $\theta_j' = \bar\theta_{12} + \theta_j''$, $j = 1,2$. In this expression, $\theta_1''$ and $\theta_2''$ are
two independent angles distributed with law $g(\theta)$. Note the similarity with the
averaging process described in the previous section. The model is also very similar
to the model of rod alignment in [4]. It is easy to see that, for all distributions $g$,
the uniform distribution $f(\theta) = 1/(2\pi)$ is a stationary solution, but it may not be
the only one. Equation (8) may be written in terms of the Fourier series of $f(t,\theta)$:
with $f(t,\theta) = \sum_{k=-\infty}^{\infty} a_k(t) e^{ik\theta}$,

\[ \frac{d}{dt} a_k(t) = \sum_n a_{k-n} a_n \big( \gamma_k \Gamma(n - k/2) - \Gamma(n) \big) , \]

where

\[ \gamma_k = \int_{-\pi}^{\pi} g(\theta) e^{-ik\theta}\,\frac{d\theta}{2\pi} , \qquad \Gamma(z) = \frac{\sin(\pi z)}{\pi z} \]

($\Gamma$ actually depends on the function $\beta$ in (8); as written here it corresponds to
$\beta = 1$). To study the linear stability of the uniform distribution, we write $f(t,\theta) =
1 + \varepsilon \sum_{k=-\infty}^{\infty} b_k(t) e^{ik\theta}$, and then

\[ \frac{d}{dt} b_k(t) = b_k(t) \big( 2\gamma_k \Gamma(k/2) - \Gamma(0) - \Gamma(k) \big) + O(\varepsilon) . \]

We then see that up to order $\varepsilon$, the Fourier modes are decoupled, and hence the
linear stability can be established by checking the sign of
$\lambda_k = 2\gamma_k \Gamma(k/2) - \Gamma(0) - \Gamma(k)$. By a direct calculation
it follows that $\lambda_k < 0$ if $k \ge 2$, and therefore the stability depends on $\lambda_1$, which is
negative if and only if $\gamma_1 < \pi/4$. This in turn depends on $g(\theta)$. As an example we
take any probability density $\rho(x)$ on $\mathbb{R}$ and let

\[ g_\tau(y) = 2\pi \sum_{j=-\infty}^{\infty} \frac{1}{\tau}\,\rho\Big(\frac{y + 2\pi j}{\tau}\Big) , \qquad \gamma_k = \hat\rho(\tau k) . \]

The parameter $\tau$ determines how concentrated $g$ is around $\theta = 0$. Clearly, when
$\tau \to 0$, $\gamma_1 \to 1 > \pi/4$, and therefore the first Fourier mode is unstable for sufficiently
small $\tau$. A similar result can be found in
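
The stability criterion is easy to evaluate numerically. The sketch below assumes the
linearized growth rate $\lambda_k = 2\gamma_k\Gamma(k/2) - \Gamma(0) - \Gamma(k)$ with
$\Gamma(z) = \sin(\pi z)/(\pi z)$, i.e. the case $\beta = 1$, and uses a standard Gaussian for the
density that generates $g$, which is an illustrative choice only.

```python
import math

def Gamma(z):
    """Gamma(z) = sin(pi z) / (pi z) for beta = 1, with Gamma(0) = 1."""
    if z == 0:
        return 1.0
    return math.sin(math.pi * z) / (math.pi * z)

def growth_rate(k, gamma_k):
    """Linearized growth rate of the k-th Fourier mode around the uniform state."""
    return 2.0 * gamma_k * Gamma(k / 2) - Gamma(0) - Gamma(k)

def gamma_gaussian(k, tau):
    """Fourier transform at tau * k of a standard Gaussian density (illustrative)."""
    return math.exp(-0.5 * (tau * k) ** 2)

for tau in (0.2, 0.5, 0.7, 1.0):
    rates = [growth_rate(k, gamma_gaussian(k, tau)) for k in (1, 2, 3)]
    print(tau, [round(r, 4) for r in rates])
```

For the Gaussian choice, the first mode changes sign where $e^{-\tau^2/2} = \pi/4$, i.e. near
$\tau \approx 0.7$, while the higher modes remain damped.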



4. Propagation of chaos 

Boltzmann's and Maxwell's kinetic theory was derived from a physical point
of view, and it would take a very long time before a mathematically satisfactory
derivation was carried out by Lanford [33] for a hard ball gas. To date,
a derivation valid over macroscopic intervals of time is still essentially missing (see the
notes by Pulvirenti [39] for details about this).

Mark Kac [31] invented a Markov process that mimics an $N$-particle system,
proposed a mathematically rigorous definition of propagation of chaos, and showed
that his model satisfies this property. In the following sections, we will present
Kac's model and his proof, and then follow through the steps of e.g. Grünbaum [26]
towards an abstract theorem stating not only that a large class of Markov processes
propagate chaos according to the definition of Kac, but also giving precise error
bounds in terms of the number of particles, and detailed information about the
limiting equation. The results are proven in [38] and [37]. An important ingredient
in the abstract formulation is the de Finetti (or Hewitt-Savage) theorem, which is
also presented in these notes, following the lectures by P.L. Lions [34].

4.1. Kac's approach to the propagation of chaos. Kac's model of an $N$-particle
system is a jump process on the sphere in $\mathbb{R}^N$,

\[ S^{N-1}(\sqrt{N}) = \big\{ (v_1,\dots,v_N) : v_1^2 + \dots + v_N^2 = N \big\} . \]

Each coordinate represents the (one dimensional) velocity of a particle, and the
radius is chosen so that the expected energy of a particle (with unit mass) is one.
The particles suffer binary collisions, which are modeled as jumps involving two
coordinates at a time: at exponentially distributed time intervals, two coordinates,
say $v_i$ and $v_j$, are chosen uniformly at random, and they are given new velocities $v_i'$
and $v_j'$:

\[ (v_1,\dots,v_i,\dots,v_j,\dots,v_N) \mapsto (v_1,\dots,v_i',\dots,v_j',\dots,v_N) . \]

The new velocities are obtained as a random rotation in $\mathbb{R}^2$: $\theta$ is chosen at random
according to a law $\mu$, and then

\[ (v_i', v_j') = (v_i\cos\theta - v_j\sin\theta,\; v_i\sin\theta + v_j\cos\theta) . \]

We will use the notation $V \mapsto V' = R_{ij}(\theta)V$. Clearly

\[ v_i'^2 + v_j'^2 = v_i^2 + v_j^2 , \]

and therefore this process preserves energy exactly. On the other hand, there is no
conservation of momentum:

\[ v_i' + v_j' \ne v_i + v_j . \]

With only one dimensional velocities, only trivial collisions can satisfy both energy
and momentum conservation.
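
The jump process is simple enough to simulate directly. The following sketch (with
hypothetical function names, and with the rotation angle drawn uniformly on $(-\pi,\pi)$ as
one possible law) applies a sequence of random binary rotations and prints the energy and
the momentum afterwards.

```python
import math
import random

def kac_collision(v, i, j, theta):
    """Return a copy of v in which coordinates i and j are rotated by the angle theta."""
    vi, vj = v[i], v[j]
    w = list(v)
    w[i] = vi * math.cos(theta) - vj * math.sin(theta)
    w[j] = vi * math.sin(theta) + vj * math.cos(theta)
    return w

def kac_walk(v, n_jumps, rng):
    """Apply n_jumps collisions with uniformly chosen pairs and uniform angles."""
    for _ in range(n_jumps):
        i, j = rng.sample(range(len(v)), 2)
        theta = rng.uniform(-math.pi, math.pi)
        v = kac_collision(v, i, j, theta)
    return v

rng = random.Random(0)
N = 10
v = [1.0] * N                  # a point on the sphere: the sum of squares equals N
w = kac_walk(v, 1000, rng)
print("energy:  ", sum(x * x for x in w))   # stays equal to N up to rounding
print("momentum:", sum(w))                  # in general no longer equal to sum(v)
```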

The Markov process just described can equivalently be defined by a master equation,
which describes the evolution of phase space density. Writing $V = (v_1,\dots,v_N)$,
and $\psi(V,t) = \psi(v_1,\dots,v_N,t)$, we assume that the random, initial point of the
Markov jump process is distributed with density $\psi_0(V)$. To have a concrete example,
we assume that the law for the random rotations in a collision is $\mu(d\theta) =
(2\pi)^{-1}d\theta$ (any bounded measure can be treated in the same way, but for singular
measures, as in e.g. [41], one needs to be a little more careful). The density at time
$t$, $\psi(V,t)$, then satisfies

\[ \partial_t \psi_N(V,t) = \frac{2}{N-1} \sum_{1\le i<j\le N} \int_{-\pi}^{\pi}
   \big( \psi_N(R_{ij}(\theta)V,t) - \psi_N(V,t) \big)\,\frac{d\theta}{2\pi} . \tag{9} \]

The factor in front of the sum in the right hand side implies that the system
jumps on average $N$ times per unit time, and because the coordinates are drawn
independently, this means that each coordinate is changed approximately twice
per unit time. This corresponds to the Boltzmann-Grad scaling of a real particle
system, because each particle should then on average suffer the same (finite) number
of collisions per unit time, independently of the number $N$ of particles.
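
The bookkeeping behind this scaling is short enough to check exactly. The sketch assumes
that the prefactor in (9) is $2/(N-1)$, which is what the statement "N jumps per unit time"
requires when the sum runs over the $N(N-1)/2$ pairs $i<j$ and the angle is integrated
against a probability measure.

```python
from fractions import Fraction

def total_jump_rate(N):
    """Jump rate per pair times the number of pairs i < j."""
    rate_per_pair = Fraction(2, N - 1)
    number_of_pairs = Fraction(N * (N - 1), 2)
    return rate_per_pair * number_of_pairs

def rate_per_coordinate(N):
    """Each jump changes two of the N coordinates."""
    return 2 * total_jump_rate(N) / N

for N in (2, 10, 10**6):
    print(N, total_jump_rate(N), rate_per_coordinate(N))
```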

Because all particles in a gas are assumed to be identical, the probability distribution
of initial values should not depend on the order in which we write them, and
this is expressed by saying that the initial distribution should be symmetric with
respect to permutations:

Definition 4.1. The density $\psi_0(v_1,\dots,v_N)$ is symmetric if for any pair of variables,

\[ P[(v_1,\dots,v_i,\dots,v_j,\dots,v_N) \in A] = P[(v_1,\dots,v_j,\dots,v_i,\dots,v_N) \in A] . \]

The space of configurations obtained by identifying all points $V = (v_1,\dots,v_N)$ that
can be obtained from each other by a permutation of the indices,

\[ (v_1,\dots,v_i,\dots,v_j,\dots,v_N) \sim (v_1,\dots,v_j,\dots,v_i,\dots,v_N) , \]

is denoted $S^{N-1}(\sqrt{N})/\mathfrak{S}_N$.

Note that the Kac jump process preserves permutation symmetry.

In the same way as the Kac master equation corresponds to the Liouville equation
for a real $N$-particle system, there is a Boltzmann equation for the velocity
distribution of one particle, that can formally be obtained in the limit of infinitely
many particles. This is Kac's caricature of the Boltzmann equation, the Kac
equation:

\[ \partial_t f(v,t) = 2\int_{-\infty}^{\infty}\int_{-\pi}^{\pi} \big( f(v',t) f(w',t) - f(v,t) f(w,t) \big)\,\frac{d\theta}{2\pi}\,dw , \]

where just as in the definition of the jump process,

\[ (v', w') = (v\cos\theta - w\sin\theta,\; v\sin\theta + w\cos\theta) . \]

In spite of its relative simplicity, its structure is very similar to that of the
Boltzmann equation, and the two equations share many characteristics. There are
numerous studies that consider the trend to equilibrium, the regularity of solutions,
its behavior in the presence of external force terms, ..., with the hope that it will
give insight into the behavior of the full equation. Some relevant references
are 0|lli]. 

Mass and energy conservation are among the most important properties of the
solutions of the Kac equation:

\[ \int f(v,t)\,dv = \text{const}, \qquad \int f(v,t)\,v^2\,dv = \text{const}, \]

and also the entropy $\int f \log f\,dv$ is non-increasing, just as for the real Boltzmann
equation. On the other hand, the momentum is not a conserved quantity.

The master equation and the Kac equation are connected through the marginal
distributions. We define, for $k = 1,\dots,N-1$,

\[ f_k^N(v_1,v_2,\dots,v_k,t) = \int_{\Omega_k} \psi_N(v_1,\dots,v_k,v_{k+1},\dots,v_N,t)\,d\sigma_k , \]

where $\Omega_k = S^{N-1-k}\big(\sqrt{N - v_1^2 - \dots - v_k^2}\big)$, and $\sigma_k$ is the uniform normalized
measure on $\Omega_k$. The marginals $f_k^N$, the "$k$-particle distributions", give the joint
distribution of the first $k$ coordinates, and because of the permutation symmetry, the
distribution of any collection of $k$ different coordinates is the same, and corresponds
to the joint distribution of $k$ randomly chosen particles in a real gas. Because $\psi_N$
is assumed to be symmetric, the marginal distributions are too.

The evolution equations for the $k$-particle marginal can be obtained simply by
integrating the master equation over $v_{k+1},\dots,v_N$. For example,

\[ \partial_t f_1^N(v_1,t) = 2\int\int_{-\pi}^{\pi} \big( f_2^N(v_1',v_2',t) - f_2^N(v_1,v_2,t) \big)\,\frac{d\theta}{2\pi}\,dv_2 , \tag{11} \]

and similar equations can be obtained for all $f_k^N$. If we assume that for each $k$,
$f_k^N \to f_k$ for some function $f_k(v_1,\dots,v_k,t)$, which for each $t$ is a density in $\mathbb{R}^k$, then
it is possible, at least formally, to pass to the limit in (11) to get

\[ \partial_t f_1(v_1,t) = 2\int_{-\infty}^{\infty}\int_{-\pi}^{\pi} \big( f_2(v_1',v_2',t) - f_2(v_1,v_2,t) \big)\,\frac{d\theta}{2\pi}\,dv_2 . \]

The situation is similar for all $k$: because the right-hand side involves $f_{k+1}$,
one does not obtain a closed system of equations. The whole discussion about
propagation of chaos aims at proving that if certain hypotheses are satisfied, then
$f_k(v_1,\dots,v_k,t) = f_1(v_1,t)\cdots f_1(v_k,t)$. The interpretation of this is that drawing a
$k$-tuple of velocities is the same as drawing $k$ velocities independently: the particle
velocities are independent. This cannot be true for any finite $N$, but can sometimes
be proven to be correct in the limit as $N \to \infty$.

4.2. Propagation of chaos in Kac's model. Kac defined propagation of chaos
as follows:

Definition 4.2. A sequence of probability measures $\psi_N(v_1,\dots,v_N)$, $N = 1,\dots,\infty$,
is said to have the Boltzmann property, or to be chaotic, if for each $k$,

\[ \lim_{N\to\infty} f_k^N(v_1,\dots,v_k) = \prod_{j=1}^{k} \lim_{N\to\infty} f_1^N(v_j) . \]

Assume that the evolution of a sequence of probability measures $\psi_N(v_1,\dots,v_N,t)$ is
governed by a family of Markov processes, and that the sequence is chaotic for each
$t > 0$ whenever it is chaotic at $t = 0$. Then the propagation of chaos is said to hold
for these Markov processes.

One of the main achievements in [31] was Kac's proof that propagation of chaos
holds for his model, and hence that the Kac equation can be derived rigorously as
the limit of a many particle system.

Theorem 4.3 (M. Kac). Propagation of chaos holds for the master equation (9).

Very briefly, the main steps of the proof are as follows: Seen as an operator in
$L^2(S^{N-1}(\sqrt{N}))$, the operator $Q$ defined by the right-hand side of (9)
is self adjoint and bounded, and hence $\psi_N(V,t) = \exp(tQ)\psi_N(V,0)$, where

\[ \psi_N(V,t) = \sum_{k=0}^{\infty} \frac{t^k}{k!}\, Q^k \psi_N(V,0) . \tag{13} \]

We then need to compute powers of $Q$. Consider first a bounded function
$g_1(V) = g(v_1)$, i.e. a function depending on only the first component of $V$, and let

\[ g_2(v_1,v_2) = 2\int_{-\pi}^{\pi} \big( g_1(v_1\cos\theta + v_2\sin\theta) - g_1(v_1) \big)\,\frac{d\theta}{2\pi} . \tag{14} \]

By recursion, let

\[ g_{k+1}(v_1,\dots,v_k,v_{k+1}) = 2\sum_{j=1}^{k} \int_{-\pi}^{\pi}
   \big( g_k(v_1,\dots,v_j\cos\theta + v_{k+1}\sin\theta,\dots,v_k) - g_k(v_1,\dots,v_k) \big)\,\frac{d\theta}{2\pi} . \tag{15} \]

The reason for introducing the $g_k$ in this way is that computing
$\int_{S^{N-1}(\sqrt{N})} \psi_N(V) g_1(V)\,d\sigma_N$
for all bounded $g_1(V)$ is enough to identify the one particle marginal $f_1^N(v_1)$, and
that because $Q$ is self adjoint, $(Q\psi, g_1) = (\psi, Qg_1)$. The formulae (14)-(15) then
appear in the calculation of $(\psi, Q^k g_1)$.

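Next we assume that the initial data are chaotic, so that $\psi_N(v_1,\dots,v_N,0) =
f_0^N(v_1,\dots,v_N)$ with $k$-particle marginals $f_{0,k}^N$, and that there are functions $f_{0,k}$
such that $f_{0,k}(v_1,\dots,v_k) = \lim_{N\to\infty} f_{0,k}^N(v_1,\dots,v_k)$, and moreover, with $f_0 = f_{0,1}$,

\[ \begin{aligned}
 \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} & f_{0,k+1}(v_1,\dots,v_{k+1})\, g_{k+1}(v_1,\dots,v_{k+1})\,dv_1\dots dv_{k+1} \\
 &= \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} f_0(v_1)\cdots f_0(v_{k+1})\, g_{k+1}(v_1,\dots,v_{k+1})\,dv_1\dots dv_{k+1} .
\end{aligned} \]

Multiplying all terms in (13) with $g_1(v_1)$, integrating and letting $N \to \infty$, we
get

\[ \int_{-\infty}^{\infty} f_1(v_1,t)\, g(v_1)\,dv_1 = \sum_{k=0}^{\infty} \frac{t^k}{k!}
   \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} f_0(v_1)\cdots f_0(v_{k+1})\, g_{k+1}(v_1,\dots,v_{k+1})\,dv_1\dots dv_{k+1} \tag{16} \]

for $0 < t < 2$. Similarly for the two-particle marginals

\[ \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f_2(v_1,v_2,t)\, g(v_1) h(v_2)\,dv_1\,dv_2 = \sum_{k=0}^{\infty} \frac{t^k}{k!}
   \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} f_0(v_1)\cdots f_0(v_{k+2})\, \gamma_{k+2}(v_1,\dots,v_{k+2})\,dv_1\dots dv_{k+2} . \tag{17} \]

Here the $\gamma_k$ are obtained by iteration:

\[ \gamma_2(v_1,v_2) = g(v_1) h(v_2) , \]
\[ \gamma_{k+1}(v_1,\dots,v_{k+1}) = 2\sum_{j=1}^{k} \int_{-\pi}^{\pi}
   \big( \gamma_k(v_1,\dots,v_j\cos\theta + v_{k+1}\sin\theta,\dots,v_k) - \gamma_k(v_1,\dots,v_k) \big)\,\frac{d\theta}{2\pi} . \]

We then need to prove that

\[ \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f_2(v_1,v_2,t)\, g(v_1) h(v_2)\,dv_1\,dv_2
   = \int_{-\infty}^{\infty} f_1(v_1,t)\, g(v_1)\,dv_1 \int_{-\infty}^{\infty} f_1(v_2,t)\, h(v_2)\,dv_2 . \]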

This is done by proving that the series (16) and (17) are convergent, and by
comparing the terms. This involves expressions like

\[ \gamma_3(v_1,v_2,v_3) = g_2(v_1,v_3) h_1(v_2) + g_1(v_1) h_2(v_2,v_3) , \]
\[ \gamma_4(v_1,v_2,v_3,v_4) = g_2(v_1,v_3) h_2(v_2,v_4) + g_3(v_1,v_3,v_4) h_1(v_2)
   + g_2(v_1,v_4) h_2(v_2,v_3) + g_1(v_1) h_3(v_2,v_3,v_4) . \]

As for the convergence of the series, this turns out to hold on a bounded time
interval, but this interval can be uniformly estimated, and hence one can prove
that propagation of chaos holds for any bounded time interval.

4.3. Existence of chaotic states. Kac proved that there is a large class of chaotic
distributions, defined as

\[ \psi_N(v_1,\dots,v_N) = \frac{\prod_{j=1}^{N} c(v_j)}{\int_{S^{N-1}(\sqrt{N})} \prod_{j=1}^{N} c(w_j)\,d\sigma_N(w_1,\dots,w_N)} . \]

The easiest examples are the uniform distributions, which are also the equilibria
for the Kac master equations. If $\psi_N(V,t)$ is the solution to equation (9) then

\[ \lim_{t\to\infty} \psi_N(V,t) = \frac{1}{|S^{N-1}(\sqrt{N})|} . \]

In this case one can carry out explicit calculations rather easily: To compute the
limit of a one-particle marginal, let

\[ P(v_N > w) = \begin{cases}
   0 & \text{if } w > \sqrt{N} , \\
   \int_{\{v_N > w\}} d\sigma_N(v_1,\dots,v_N) & \text{if } -\sqrt{N} < w < \sqrt{N} , \\
   1 & \text{if } w < -\sqrt{N} .
\end{cases} \]

Then write the spherical cap as

\[ \big\{ (v_1,v_2,\dots,v_N) : v_1^2 + v_2^2 + \dots + v_N^2 = N ,\; v_N > w \big\} , \]

whose area is proportional to

\[ \int_0^{\arccos(w/\sqrt{N})} (1 - \cos^2\theta)^{(N-3)/2} \sin\theta\,d\theta = \int_{w/\sqrt{N}}^{1} (1 - x^2)^{(N-3)/2}\,dx . \]

Finally, because

\[ \Big(1 - \frac{x^2}{N}\Big)^{(N-3)/2} \to e^{-x^2/2} \quad \text{as } N \to \infty , \]

we may deduce that the one-particle distribution converges to a Maxwellian, and a
similar calculation can be carried out for the two-particle distribution, and so on.
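
The convergence to a Maxwellian can be observed numerically. The sketch below uses the
explicit density of one coordinate of a uniformly distributed point on the sphere of radius
$\sqrt{N}$ in $\mathbb{R}^N$, namely $c_N (1 - v^2/N)^{(N-3)/2}$ with
$c_N = \Gamma(N/2) / (\sqrt{\pi N}\,\Gamma((N-1)/2))$, and compares it with the standard
Gaussian density; the function names are introduced here for illustration only.

```python
import math

def sphere_marginal_density(v, N):
    """Density of one coordinate of a uniform point on the sphere of radius sqrt(N) in R^N."""
    if abs(v) >= math.sqrt(N):
        return 0.0
    log_norm = (math.lgamma(N / 2) - math.lgamma((N - 1) / 2)
                - 0.5 * math.log(math.pi * N))
    return math.exp(log_norm + 0.5 * (N - 3) * math.log(1.0 - v * v / N))

def maxwellian(v):
    """Standard Gaussian density, the limit as N tends to infinity."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)

for N in (5, 50, 5000):
    diffs = [sphere_marginal_density(v, N) - maxwellian(v) for v in (0.0, 1.0, 2.0)]
    print(N, [round(d, 5) for d in diffs])
```

The differences decay roughly like $1/N$, in line with the expansion of
$(1 - x^2/N)^{(N-3)/2}$ around $e^{-x^2/2}$.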



5. Empirical distributions 

One difficulty with the approach of Kac is that each $N$-particle system has its
own state space, while the questions of convergence would be more easily stated if
one could embed all the $N$-particle systems in the same space. One approach to this
was suggested by Grünbaum [26], who proposed a method for proving a propagation
of chaos result for the spatially homogeneous Boltzmann equation for hard spheres.
We begin by discussing this in abstract terms.

The phase space, or configuration space, for the $N$ particles is then $E^N$, an $N$-fold
product of a Euclidean space $E$, or more precisely, in order that the way in
which the particles are numbered is not important, $E^N/\mathfrak{S}_N$, the quotient of
$E^N$ by the action of the symmetric group $\mathfrak{S}_N$ of permutations of $N$ elements.
That means that

\[ X = (x_1,\dots,x_N) \in E^N \quad\text{and}\quad \tilde X = (\tilde x_1,\dots,\tilde x_N) \in E^N \]

are identified if $\tilde X$ can be obtained by a permutation of the coordinates of $X$. We
then define the empirical measure associated with $X$ as

\[ \mu_X^N = \frac{1}{N}\sum_{j=1}^{N} \delta_{x_j} \in \mathcal{P}_N(E) \subset \mathcal{P}(E) . \]

Here we have introduced the notation $\mathcal{P}(E)$ for the set of probability measures on
$E$, and $\mathcal{P}_N(E)$ for the probability measures consisting of $N$ Dirac measures of equal
mass. This is slightly at variance with the usual definition of empirical measure
in probability theory, where the $x_j$ are assumed to be i.i.d. random variables with
some distribution $\mu \in \mathcal{P}(E)$.

One important property of $\mathcal{P}(E)$ is that it is metrizable. Metrics can be
introduced in several different ways, two of the most commonly used metrics being the
Lévy-Prokhorov metric and the Wasserstein distance. The first of these is defined as
follows: Let $(E,d)$ be a metric space, and let $\mathcal{P}(E)$ be the collection of probability
measures on $E$. For $\mu, \nu \in \mathcal{P}(E)$,

\[ d_{LP}(\mu,\nu) = \inf\{ \varepsilon > 0 \mid \mu(A) \le \nu(A^\varepsilon) + \varepsilon \text{ and } \nu(A) \le \mu(A^\varepsilon) + \varepsilon \text{ for all } A \in \mathcal{B}(E) \} , \]

where $A^\varepsilon = \{ x \in E \text{ such that } \inf_{y\in A} d(x,y) < \varepsilon \}$.

This definition depends, of course, on the metric $d$ on $E$. Given a distance on $E$,
there is a natural way of introducing a distance on $E^N/\mathfrak{S}_N$. Let $X = (x_1,\dots,x_N)$,
$Y = (y_1,\dots,y_N) \in E^N/\mathfrak{S}_N$. Then

\[ d_{LP}(X,Y) = \inf_{\sigma\in\mathfrak{S}_N,\, \varepsilon>0} \Big\{ \varepsilon : \frac{\#\{ j : d(x_j, y_{\sigma_j}) > \varepsilon \}}{N} \le \varepsilon \Big\} . \]

And with this metric, we may then define the Lévy-Prokhorov distance on $\mathcal{P}(E^N/\mathfrak{S}_N)$.
Note that this metric scales well with $N$. For example, if $X = (x, x, \dots, x)$ (i.e.
$N$ copies of the same $x \in E$) and $Y = (0,\dots,0)$, then $d_{LP}(X,Y) = |x|$,
independently of $N$. On the other hand, if $X = (x_1, 0, \dots, 0)$, then we always have
$d_{LP}(X,Y) \le 1/N$.

As for the Wasserstein distance, it is defined as follows: Let $\Gamma(\mu,\nu)$ be the
collection of $\gamma \in \mathcal{P}(E\times E)$ such that $\mu = \int_E \gamma(\cdot,dy)$ and $\nu = \int_E \gamma(dx,\cdot)$. Hence
the $\gamma$s are joint probability measures with $\mu$ and $\nu$ as marginal distributions,
and the Wasserstein distance is defined as

\[ W_p(\mu,\nu)^p = \inf_{\gamma\in\Gamma(\mu,\nu)} \int d(x,y)^p\,d\gamma(x,y) = \inf E[\,|X - Y|^p\,] . \]
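
For empirical measures on the real line with the same number of atoms, the Wasserstein
distance can be computed by sorting, since the optimal coupling pairs the points in
increasing order. The sketch below uses this fact (valid for $p \ge 1$ and $E = \mathbb{R}$ with the
usual distance; the function name is hypothetical) to illustrate how the distance between
configurations scales with $N$.

```python
def wasserstein_empirical_1d(xs, ys, p=1):
    """W_p between the empirical measures of two equally long lists of reals."""
    if len(xs) != len(ys):
        raise ValueError("both configurations must contain the same number of points")
    n = len(xs)
    # In one dimension the monotone (sorted) pairing is an optimal coupling.
    cost = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys))) / n
    return cost ** (1.0 / p)

for N in (1, 10, 100):
    all_moved = wasserstein_empirical_1d([0.7] * N, [0.0] * N)
    one_moved = wasserstein_empirical_1d([0.7] + [0.0] * (N - 1), [0.0] * N)
    print(N, all_moved, one_moved)
```

Moving all $N$ particles by the same amount gives a distance independent of $N$, while moving
a single particle gives a distance of order $1/N$ for $p = 1$.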

5.1. The Hewitt-Savage theorem. The Hewitt-Savage theorem (which is an extension
of de Finetti's theorem), the topic of this section, is relevant for the
discussion of propagation of chaos, but it is included here also to serve as an
introduction to the rather abstract notation that will be used later. The material is
essentially taken from [34].

Let $\mu^N \in \mathcal{P}(E^N/\mathfrak{S}_N)$. The marginals $\mu_k^N \in \mathcal{P}(E^k/\mathfrak{S}_k)$ are defined by

\[ \int_{E^k} \phi(x_1,\dots,x_k)\,\mu_k^N(dx_1,\dots,dx_k) = \int_{E^N} \phi(x_1,\dots,x_k)\,\mu^N(dx_1,\dots,dx_k,\dots,dx_N) , \]

which should hold for every symmetric $\phi \in C(E^k)$. In the following we assume that
$E$ is a compact metric space, but for example $E = \mathbb{R}^n$ could be treated in a similar
manner, although with some extra technical complication.

We then identify $X \in E^N/\mathfrak{S}_N$ with an empirical measure as above,

\[ \mu_X^N = \frac{1}{N}\sum_{j=1}^{N} \delta_{x_j} ; \]

this then yields a natural identification of functions $u : E^N/\mathfrak{S}_N \to \mathbb{R}$ with functions
$\mathcal{P}_N(E) \to \mathbb{R}$:

\[ u(X) \leftrightarrow u(\mu_X^N) . \]

Note that this identification does not automatically preserve properties like
continuity, unless some care is taken in choosing the metric on $E^N/\mathfrak{S}_N$. The
Lévy-Prokhorov example given above has this property.

Next we consider a sequence of functions $\{u_N : E^N/\mathfrak{S}_N \to \mathbb{R}\}_{N=1}^{\infty}$. These are
all defined on different spaces, and hence there is no immediate way of comparing
the functions, talking about convergence, etc. But the identification with
measures in $\mathcal{P}(E)$ provides a means of doing so: one can say that the sequence $u_N$
converges if the corresponding sequence of functions of measures converges.

We have the following compactness result:

Consider a bounded sequence $\{u_N : E^N/\mathfrak{S}_N \to \mathbb{R}\}$, with $|u_N| \le C$, and let
$\omega : \mathbb{R}_+ \to \mathbb{R}_+$ be a strictly increasing function with $\omega(r) \to 0$ as $r \to 0$. Assume that

\[ |u_N(X) - u_N(Y)| \le \omega(d_{LP}(X,Y)) . \]

What this says is that the sequence is bounded and uniformly continuous with
modulus of continuity $\omega$. Then there is a subsequence $u_{N'}$ and $U \in C(\mathcal{P}(E))$ such
that

\[ \sup_{X\in E^{N'}} \big| u_{N'}(x_1,\dots,x_{N'}) - U(\mu_X^{N'}) \big| \to 0 . \]

Of course, in most cases the limiting function cannot be identified with any function
$u_N : E^N/\mathfrak{S}_N \to \mathbb{R}$ for a finite value of $N$, but such limits are in all cases good examples
of symmetric functions of infinitely many variables $x_i \in E$.

Next consider the following calculation: with $E$ compact, let $\mu^N \in \mathcal{P}(E^N/\mathfrak{S}_N)$,
$N = 1, 2, 3, \dots$, and consider the marginal distributions
$\mu_k^N = \int \mu^N(\cdot, dx_{k+1},\dots,dx_N) \in \mathcal{P}(E^k/\mathfrak{S}_k)$, for $1 \le k \le N$.
Because $E$ is compact, $\mathcal{P}(E^k/\mathfrak{S}_k)$ is also compact, and
for every $k$ there is a subsequence $(N')$ such that

\[ \mu_k^{N'} \to \mu_k \in \mathcal{P}(E^k/\mathfrak{S}_k) . \]

By the usual diagonal procedure, it is then possible to extract a subsequence $N''$
such that

\[ \mu_k^{N''} \to \mu_k \in \mathcal{P}(E^k/\mathfrak{S}_k) \quad \text{for all } k . \]

By construction, the $(\mu_k)_{k=1}^{\infty}$ satisfy

\[ \mu_k = \int_E \mu_{k+1}(\cdot, dx_{k+1}) . \tag{18} \]

The Hewitt-Savage theorem [29], which is a generalization of a theorem by de
Finetti, concerns sequences of measures with exactly this property. It is common
to express this as exchangeability: a sequence of random variables, $X_1, X_2, \dots$ is said
to be exchangeable if for any $n \ge 1$ and any $\sigma = (\sigma_1,\dots,\sigma_n) \in \mathfrak{S}_n$, the $n$-tuples
$X_1,\dots,X_n$ and $X_{\sigma_1},\dots,X_{\sigma_n}$ have the same distribution.

Theorem 5.1. Assume that a sequence of measures $\mu_k \in \mathcal{P}(E^k/\mathfrak{S}_k)$, $k = 1,2,\dots$
satisfies (18). Then there is $\pi \in \mathcal{P}(\mathcal{P}(E))$ such that for all $k \ge 1$,

\[ \mu_k(dx_1,\dots,dx_k) = \int_{\mathcal{P}(E)} \prod_{i=1}^{k} m(dx_i)\,\pi(dm) . \]

The easiest conceivable example is $\pi = \delta_{\bar m}$ with $\bar m \in \mathcal{P}(E)$. Then $\pi$ is a measure in
$\mathcal{P}(\mathcal{P}(E))$ that is concentrated on $\bar m \in \mathcal{P}(E)$, and

\[ \mu_k(dx_1,\dots,dx_k) = \int \prod_{i=1}^{k} m(dx_i)\,\delta_{\bar m}(dm) = \bar m(dx_1)\cdots \bar m(dx_k) . \]

That is: if $\pi$ is concentrated in one point $\bar m$, the measures derived from $\pi$ factorize.
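
A finite example may help to fix ideas. Take $E = \{0,1\}$, so that a measure $m$ on $E$ is
determined by $q = m(\{1\})$, and let $\pi$ put mass on finitely many values of $q$. The sketch
below (hypothetical names, exact rational arithmetic) builds the measures $\mu_k$ from $\pi$ as in
the theorem, and checks the consistency relation (18), the symmetry, and the fact that the
$\mu_k$ factorize for a Dirac $\pi$ but not for a genuine mixture.

```python
from fractions import Fraction
from itertools import product

def mixture_marginal(xs, pi):
    """mu_k(x_1, ..., x_k) = sum over m of pi(m) * prod_i m(x_i), for binary x_i.

    `pi` maps a success probability q (which identifies a measure m on {0, 1})
    to its weight under pi.
    """
    total = Fraction(0)
    for q, weight in pi.items():
        term = weight
        for x in xs:
            term *= q if x == 1 else 1 - q
        total += term
    return total

pi_mixed = {Fraction(1, 5): Fraction(1, 2), Fraction(4, 5): Fraction(1, 2)}
pi_dirac = {Fraction(1, 2): Fraction(1)}

for name, pi in (("mixture", pi_mixed), ("Dirac", pi_dirac)):
    consistent = all(
        mixture_marginal(xs + (0,), pi) + mixture_marginal(xs + (1,), pi)
        == mixture_marginal(xs, pi)
        for xs in product((0, 1), repeat=2))
    factorizes = mixture_marginal((1, 1), pi) == mixture_marginal((1,), pi) ** 2
    print(name, "consistent:", consistent, "factorizes:", factorizes)
```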

Proof of Theorem 5.1 (P.L. Lions) [34]: With $E$ compact, $\mathcal{P}(E)$ is a compact
metric space, and we have constructed functions $U \in C(\mathcal{P}(E))$. One can also define
polynomials on $\mathcal{P}(E)$, and these obviously also belong to $C(\mathcal{P}(E))$. The constants,
polynomials of degree zero, are in $C(\mathcal{P}(E))$, and to define polynomials of degree 1,
take $\phi \in C(E)$, and let

\[ m \mapsto \int_E \phi(x)\,m(dx) = P_1(m) . \]

If $m_1 \ne m_2$, one can find $\phi$ such that $\int_E \phi(x)\,m_1(dx) \ne \int_E \phi(x)\,m_2(dx)$, and hence
these linear functions separate points in $\mathcal{P}(E)$. Then the monomials $P_j(m)$ of
degree $j$ are defined as follows. Take $\phi_j \in C(E^j/\mathfrak{S}_j)$ and then let

\[ P_j(m) = \int_{E^j} \phi_j(x_1,\dots,x_j)\,m(dx_1)\cdots m(dx_j) . \]

From these definitions one may then define polynomials of all orders, and the
Stone-Weierstrass theorem states that the set of polynomials is dense in $C(\mathcal{P}(E))$.
Evaluating these polynomials on empirical measures $m = \frac{1}{k}\sum_{i=1}^{k} \delta_{x_i}$, we find

\[ P_j\Big(\frac{1}{k}\sum_{i=1}^{k} \delta_{x_i}\Big) = \frac{1}{k^j} \sum_{i_1=1}^{k}\cdots\sum_{i_j=1}^{k} \phi_j(x_{i_1},\dots,x_{i_j}) . \]

For example,

\[ P_2\Big(\tfrac{1}{3}(\delta_{x_1} + \delta_{x_2} + \delta_{x_3})\Big) = \frac{1}{9}\sum_{i,j=1}^{3} \phi(x_i,x_j) . \]

In these second degree polynomials there are some terms like $\frac{1}{9}\phi(x_1,x_1)$, and the
same will happen for polynomials of higher degree. However, when $k$ is much larger
than the degree $j$ of the polynomial, a vast majority of the terms will be of the
form $\phi(x_{i_1},\dots,x_{i_j})$ where all the arguments are different.
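
The combinatorial statement can be made quantitative: among the $k^j$ index tuples in the
sum, exactly $k(k-1)\cdots(k-j+1)$ have pairwise distinct entries. The following sketch
compares a brute-force count with this closed form; the function names are introduced for
illustration only.

```python
from itertools import product
from math import perm

def distinct_fraction_bruteforce(k, j):
    """Fraction of index tuples (i_1, ..., i_j) in {1..k}^j with pairwise distinct entries."""
    tuples = list(product(range(k), repeat=j))
    distinct = sum(1 for t in tuples if len(set(t)) == j)
    return distinct / len(tuples)

def distinct_fraction(k, j):
    """Closed form k (k-1) ... (k-j+1) / k^j."""
    return perm(k, j) / k ** j

print(distinct_fraction_bruteforce(3, 2), distinct_fraction(3, 2))  # 6 of the 9 terms
for k in (10, 100, 10**4, 10**6):
    print(k, distinct_fraction(k, 3))
```

For fixed $j$ the fraction of terms with repeated arguments is of order $j^2/k$, which is the
size of the error discussed in the next step of the proof.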

Now let $\{\mu_k\}_{k=1}^{\infty}$ be the measures given in the statement of the theorem, and
consider

\[ \int_{E^k} P_j\Big(\frac{1}{k}\sum_{i=1}^{k} \delta_{x_i}\Big)\,\mu_k(dx_1,\dots,dx_k) \approx \int_{E^k} \phi(x_1,\dots,x_j)\,\mu_k(dx_1,\dots,dx_j,\dots,dx_k) . \]

The difference between the left and right terms is due to the presence of terms in which
two or more of the arguments of $\phi$ are taken to be the same $x_i$, and so becomes negligible when
$k$ is large compared to $j$. The error can be estimated by a simple combinatorial
argument. And

\[ \int_{E^k} \phi(x_1,\dots,x_j)\,\mu_k(dx_1,\dots,dx_j,\dots,dx_k) = \int_{E^j} \phi(x_1,\dots,x_j)\,\mu_j(dx_1,\dots,dx_j) \]

because of the relation (18).

Next we define a linear functional on the set of polynomials $P \in C(\mathcal{P}(E))$ by
setting

\[ \mathcal{L}(P) = \lim_{k\to\infty} \int_{E^k} P\Big(\frac{1}{k}\sum_{j=1}^{k} \delta_{x_j}\Big)\,\mu_k(dx_1,\dots,dx_k) . \]

Then $P \mapsto \mathcal{L}(P)$ is positive and $\mathcal{L}(1) = 1$, so $\mathcal{L}$ is a positive, bounded functional
defined on a dense subset of $C(\mathcal{P}(E))$, and can be extended to all of $C(\mathcal{P}(E))$. Then
Riesz's theorem states that there is a measure $\pi \in \mathcal{P}(\mathcal{P}(E))$ such that

\[ \mathcal{L}(U) = \int_{\mathcal{P}(E)} U(m)\,\pi(dm) . \]

This measure $\pi$ is the desired measure. We only need to check that the measures
$\mu_k$ can be obtained from $\pi$ as stated in the theorem. To this end, consider

\[ \int_{\mathcal{P}(E)} \prod_{i=1}^{k} m(dx_i)\,\pi(dm) \in \mathcal{P}(E^k/\mathfrak{S}_k) . \]

Integrating a function $\phi(x_1,\dots,x_k) \in C(E^k/\mathfrak{S}_k)$ with respect to this measure gives

\[ \int_{\mathcal{P}(E)} \int_{E^k} \phi(x_1,\dots,x_k) \prod_{i=1}^{k} m(dx_i)\,\pi(dm) = \int_{\mathcal{P}(E)} P_k(m)\,\pi(dm) = \mathcal{L}(P_k)
   = \int_{E^k} \phi(x_1,\dots,x_k)\,\mu_k(dx_1,\dots,dx_k) , \]

and this completes the proof. □

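By a short calculation we can now establish that if $\pi$ gives rise to measures that
factorize, then $\pi$ is a Dirac measure:

Proposition 5.2. Assume that $\pi \in \mathcal{P}(\mathcal{P}(E))$, and that

\[ \int_{\mathcal{P}(E)} m(dx_1)\,m(dx_2)\,\pi(dm) = \Big( \int_{\mathcal{P}(E)} m(dx_1)\,\pi(dm) \Big) \Big( \int_{\mathcal{P}(E)} m(dx_2)\,\pi(dm) \Big) . \]

Then there is $\bar m \in \mathcal{P}(E)$ such that $\pi = \delta_{\bar m}$.

Proof (Lions) [34]. Multiply by $\phi(x_1)\phi(x_2)$, $\phi \in C(E)$, and integrate. Then the
left-hand side is

\[ \int_{\mathcal{P}(E)} \Big( \int_E \phi(x_1)\,m(dx_1) \Big) \Big( \int_E \phi(x_2)\,m(dx_2) \Big)\,\pi(dm)
   = \int_{\mathcal{P}(E)} \Big( \int_E \phi(x)\,m(dx) \Big)^2 \pi(dm) \]

and the right-hand side

\[ \Big( \int_{\mathcal{P}(E)} \int_E \phi(x)\,m(dx)\,\pi(dm) \Big)^2 . \]

But Jensen's inequality states that

\[ \Big( \int_{\mathcal{P}(E)} \int_E \phi(x)\,m(dx)\,\pi(dm) \Big)^2 \le \int_{\mathcal{P}(E)} \Big( \int_E \phi(x)\,m(dx) \Big)^2 \pi(dm) , \]

with equality only if $\int_E \phi(x)\,m(dx) = c(\phi)$, i.e. independent of $m$ on the support of
$\pi(dm)$. Since this holds for every $\phi \in C(E)$, and these functionals separate points in
$\mathcal{P}(E)$, the support of $\pi$ is a single point, and it follows that $\pi = \delta_{\bar m}$. □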

6. Estimates on the propagation of chaos for N-particle systems

Kac's approach to the propagation of chaos concerns a very simplified model,
and it is not a trivial matter to extend it to more realistic models. For example, his
model is essentially Maxwellian, which means that the collision rate of two particles
does not depend on their relative velocity. Grünbaum [26] circumvented some of
these problems by a more abstract approach based on identifying an $N$-particle
configuration $(v_1,\dots,v_N) \in \mathbb{R}^{3N}$ with an empirical measure $\frac{1}{N}\sum_{j=1}^{N} \delta_{v_j}$, very much
like in the discussion about the Hewitt-Savage theorem. Some other works in the
same direction are the results on statistical solutions to the Boltzmann equation
that can be found e.g. in [2].

In Grünbaum's terminology, the set $\mathcal{P}(E^N/\mathfrak{S}_N)$ is convex, and its extreme points
are exactly the symmetric Dirac measures, $\frac{1}{N!}\sum_{\sigma\in\mathfrak{S}_N} \delta_{(v_{\sigma_1},\dots,v_{\sigma_N})}$. Any point
$\beta \in \mathcal{P}(E^N/\mathfrak{S}_N)$ can be expressed as the barycenter of the extreme points: there is a
measure $\Omega_\beta$ such that $\beta = \int_{E^N} X\,\Omega_\beta(dX)$, and his work is based on an analysis
of the evolution of $\Omega_\beta$ under the collision process.

The remaining part of this paper is a summary of the results in [35] and in [37],
where the method of Grünbaum is rephrased, and new quantitative results are obtained on the
rate at which the propagation of chaos is achieved with an increasing number of
particles.

6.1. The abstract setting. To formulate the main results in [35] and in [37], we
need to introduce a number of spaces, operators on the spaces and maps between
the spaces, as shown in Figure 6.

We consider a family of Markov jump processes, $N = 1, 2, 3, \dots$, on the spaces
$E^N$. The space $E$ is a locally compact, separable metric space. For every $N$
there is a process $X_t = (x_{1,t},\dots,x_{N,t})$ and a propagator $P_t^N$ so that

\[ X_t = P_t^N X_0 , \]

and because we want to be able to identify $X_t$ and $\tilde X_t$ if the components of $\tilde X_t$ can
be obtained as a permutation of the components of $X_t$, we ask that $P_t^N$ commutes
with permutations of the components. This is the microscopic description, each
component $(x_i)_t \in E$ representing the position of one particle in the $N$-particle
system.

[Figure 6: diagram. Upper row: the phase space $E^N/\mathfrak{S}_N$ with propagator $P_t^N$; the
measures $\mathcal{P}_{sym}(E^N)$ with semigroup $S_t^N$ and generator $A$ (master equation); the functions
$C(E^N)$ with dual semigroup $T_t^N$ and generator $G^N$ (duality). Lower row:
$\mathcal{P}_N(E) \subset \mathcal{P}(E)$ with $S_t^\infty$; $\mathcal{P}(\mathcal{P}(E))$; $C(\mathcal{P}(E))$ with $T_t^\infty$ (push forward, duality).]

Figure 6. A summary of spaces and their relations. Semigroups
are in most cases given together with their generators, as in $S_t^N, A$.

Two of the most elementary examples are the Kac model and Grünbaum's model of
the three dimensional Boltzmann equation. The techniques developed here work
also in many other cases, and some more examples are given later in this paper. In
the diagram in Figure 6, the phase space and the propagator are shown in the
upper left corner.

The Markov processes can be described by the master equation (or Kolmogorov
equation): Let $p_t^N = \mathcal{L}(X_t)$, i.e.,

$$P[X_t \in A \subset E^N] = \int_A p_t^N(dx) .$$

There is a semigroup $S_t^N : \mathcal{P}_{sym}(E^N) \to \mathcal{P}_{sym}(E^N)$ such that $p_t^N = S_t^N p_0^N$. This
semigroup has a generator $A$, so that $p_t^N$ satisfies

$$\partial_t p_t^N = A p_t^N .$$

This is represented in the middle, upper part of the diagram. There is also a dual
semigroup $T_t^N : C(E^N/\mathfrak{S}_N) \to C(E^N/\mathfrak{S}_N)$ with a corresponding generator $G^N$,
shown in the upper right part. The two semigroups are related as follows: For all
$p^N \in \mathcal{P}(E^N)$, $\phi \in C(E^N)$,

$$\langle p^N, T_t^N(\phi) \rangle = \langle S_t^N(p^N), \phi \rangle ,$$

and, with $\phi_t = T_t^N \phi$,

$$\partial_t \phi_t = G^N \phi_t , \qquad \phi_0 = \phi .$$

Thus the upper part of the diagram represents the $N$-particle system in three
different, essentially equivalent ways. In kinetic theory we are interested in rigorously
deriving the Boltzmann equation as a limit of an $N$-particle system, and in Kac's
work [31], this corresponds to deriving the nonlinear Kac equation from his $N$-
particle model. In this abstract setting we assume that there is a formal mean field
description, an equation that governs the evolution of a one-particle distribution,
$p_t \in \mathcal{P}(E)$:

(19) $$\partial_t p_t = Q(p_t) ,$$
and typically this is a nonlinear equation. For the purpose of this paper, we require
that the initial value problem for equation (19) has a unique solution for initial data
$p_0 \in \mathcal{P}(E)$ or some subset of $\mathcal{P}(E)$. The solution is represented by a semigroup,
$p_t = S_t^\infty p_0$. We see this in the lower left of the diagram. The lower part of the
diagram thus concerns the limit as $N \to \infty$ in the $N$-particle system, and it also
provides the arena for comparing the solutions to the $N$-particle system and the
Boltzmann equation that is the formal limit. The objective is to prove that,
given that certain conditions are satisfied, the one-particle marginals converge to
the solution of equation (19):

$$p_{1,t}^N = \int_{E^{N-1}} p_t^N(\cdot, dx_2, \dots, dx_N) \longrightarrow p_t \quad \text{as } N \to \infty .$$

In order to proceed with this, we first need to represent an $N$-particle configuration
$X_t = (x_{1,t}, \dots, x_{N,t})$ in $\mathcal{P}(E)$ in the lower part of the diagram. This representation
is provided by the map $\mu^N$, which takes a point $X$ as an argument, and returns a
point measure:

$$\mu_X^N = \frac{1}{N} \sum_{i=1}^{N} \delta_{x_i} .$$

If $X$ is random, distributed according to a law $\mathcal{L}(X) = p^N \in \mathcal{P}_{sym}(E^N)$, the
resulting measure $\mu_X^N$ is random with a distribution which, in the diagram, is denoted
$\mu_p^N \in \mathcal{P}(\mathcal{P}(E))$, as indicated in the middle column.

In the same way that $\mathcal{P}_{sym}(E^N)$ is related by duality to the set of continuous,
symmetric functions, here denoted $C(E^N)$, there is a duality relation between $\mathcal{P}(\mathcal{P}(E))$
and $C(\mathcal{P}(E))$. Clearly the exact properties of this duality depend strongly on the
topology on $\mathcal{P}(E)$.

The maps $\pi^N$ and $R$ between $C(E^N)$ and $C(\mathcal{P}(E))$ are defined through $\mu_X^N$ as
follows: For $\Phi \in C(\mathcal{P}(E))$,

$$\pi^N : \Phi \mapsto \psi(x_1, \dots, x_N) = \Phi(\mu^N_{(x_1, \dots, x_N)}) .$$

That is, given $X = (x_1, \dots, x_N)$, the argument of $\psi$, we get a measure $\mu_X^N \in \mathcal{P}_N(E)$,
and this measure is then taken as an argument when evaluating $\Phi$.
Conversely, given $\phi \in C(E^N)$, the function $\Phi = R\phi \in C(\mathcal{P}(E))$ is defined as

$$\mathcal{P}(E) \ni \mu \mapsto \int_{E^N} \phi(x_1, \dots, x_N) \, \mu(dx_1) \mu(dx_2) \cdots \mu(dx_N) .$$

In the terminology of Section 5, $R\phi$ is a monomial of degree $N$.
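
The three maps just introduced can be tried out on measures with finitely many atoms. The following Python sketch is only an illustration under stated assumptions: it takes $E = \mathbb{R}$, stores a measure as a list of (point, weight) pairs, and the names `empirical_measure`, `pi_N` and `R` are hypothetical, chosen for this example and not taken from [35] or [37].

```python
from itertools import product

def empirical_measure(X):
    """mu_X^N: one atom of weight 1/N at each coordinate of X (assumed representation)."""
    n = len(X)
    return [(x, 1.0 / n) for x in X]

def pi_N(Phi):
    """pi^N sends Phi in C(P(E)) to the function X -> Phi(mu_X^N) on E^N."""
    return lambda X: Phi(empirical_measure(X))

def R(phi, degree):
    """R sends phi in C(E^degree) to the monomial mu -> integral of phi against mu^(x degree)."""
    def monomial(mu):
        total = 0.0
        # Sum over all tuples of atoms; the weight of a tuple is the product of atom weights.
        for atoms in product(mu, repeat=degree):
            weight = 1.0
            for _, w in atoms:
                weight *= w
            total += weight * phi(*(x for x, _ in atoms))
        return total
    return monomial

phi = lambda x, y: x * y          # a test function on E^2
Phi = R(phi, 2)                   # monomial of degree 2 on P(E)
psi = pi_N(Phi)                   # symmetric function on E^N
print(psi((0.0, 1.0)), psi((1.0, 0.0)))
```

By construction `psi` is invariant under permutations of the coordinates of its argument, which is the reason the maps take values in the symmetric functions.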

The last objects in the diagram are $T_t^\infty$ and $G^\infty$. The former is the push forward
of $S_t^\infty$. For $\Phi \in C(\mathcal{P}(E))$, $T_t^\infty \Phi$ is defined by $\mathcal{P}(E) \ni \mu \mapsto \Phi(S_t^\infty \mu)$, and $G^\infty$ is its
generator. Note that here $T_t^\infty$ is a linear semigroup.

The relation between the nonlinear semigroup $S_t^\infty$ and the linear semigroup
$T_t^\infty$ is similar to the relation between a Hamiltonian, finite dimensional dynamical
system, and the corresponding Liouville equation. Consider a (deterministic) system
of ODEs in $\mathbb{R}^n$ (e.g. Hamiltonian):

(20) $$\dot{x} = f(x) , \qquad x(0) = x_0 ,$$
with solution $x(t) = S_t x(0)$. The Liouville equation states how a phase space
density $\Phi_t(x)$ is transported by the flow $S_t$:

$$\Phi_t(x) = \Phi_0(S_{-t} x) =: T_t^{*} \Phi_0(x) .$$

Here $\Phi_t(x)$ is given explicitly in terms of $\Phi_0$, but the expression involves $S_{-t}$, and
hence is not valid if equation (20) cannot be solved backwards. The remedy is to
study the dual problem: Take $\varphi \in C(\mathbb{R}^n)$, multiply and integrate:

$$\int \Phi_0(S_{-t} x) \varphi(x) \, dx = \int \Phi_0(y) \varphi(S_t y) \, dy = \int \Phi_0(y) \, T_t \varphi(y) \, dy .$$

The linear semigroup $T_t$ is here defined through the forward evolution of $x$. In
Figure 6 the Boltzmann equation is the deterministic dynamical system, but in
the phase space $\mathcal{P}(E)$. A phase space density in $\mathcal{P}(\mathcal{P}(E))$ is transported by the
flow via $S_{-t}^\infty$, but in general $S_t^\infty$ is not reversible, and therefore it may be only the
dual representation that makes sense.
Solutions to the equation

$$\partial_t \Phi = G^\infty \Phi$$

in $C(\mathcal{P}(E))$ are known as statistical solutions to the Boltzmann equation, and have
been studied for example in [5].
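
The distinction between a nonlinear flow and its linear push forward can be seen in a one-dimensional toy problem. The sketch below is an assumption-laden illustration, not part of the theory above: it uses the ODE $\dot{x} = -x^3$, whose forward flow is explicit, and the hypothetical names `S` and `T` for the flow and the dual semigroup.

```python
import math

def S(t, x):
    """Exact forward flow of dx/dt = -x**3 (toy example, t >= 0)."""
    return x / math.sqrt(1.0 + 2.0 * t * x * x)

def T(t, phi):
    """Dual semigroup defined through the forward flow: (T_t phi)(y) = phi(S_t y)."""
    return lambda y: phi(S(t, y))

f = lambda y: y * y
g = math.sin
t, y = 0.7, 1.3

# T_t is linear in the test function ...
combo = T(t, lambda z: 2.0 * f(z) + 3.0 * g(z))(y)
print(combo - (2.0 * T(t, f)(y) + 3.0 * T(t, g)(y)))

# ... although the flow S_t itself is not linear in x.
print(S(t, 2.0 * y) - 2.0 * S(t, y))

# Semigroup property of the flow.
print(S(0.3 + 0.4, y) - S(0.3, S(0.4, y)))
```

For this particular ODE the backward flow blows up in finite time (the square root becomes undefined for $t < -1/(2x^2)$), while `T` only ever uses forward times.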

6.2. The main result, and important hypotheses on spaces and operators.

Like in Section 4.2, the $N$-particle system is represented by a family of master
equations, one for each $N$. That means that for each $N$ we consider

$$p_t^N = S_t^N p_0^N ,$$

where $p_0^N \in \mathcal{P}(E^N/\mathfrak{S}_N)$. The formal limit as $N \to \infty$ is given by the Boltzmann
equation, whose solution is

$$p_t = S_t^\infty p_0 ,$$

where $p_0 \in \mathcal{P}(E)$.

Theorem 6.1. (Very informally) There is a constant $C(k,\ell) > 0$ only depending
on $k$ and $\ell$ such that for any $\varphi = \varphi_1 \otimes \varphi_2 \otimes \cdots \otimes \varphi_\ell$ with $N \ge 2\ell$:

(21) $$\sup_{[0,T]} \left| \left\langle S_t^N(p_0^N) - (S_t^\infty(p_0))^{\otimes N}, \varphi \right\rangle \right| \le C(k,\ell,N) \to 0 \quad \text{when } N \to \infty .$$

What this says is that if we compare the solution to the $N$-particle master equation
with an $N$-fold tensor product of the solution to the limiting Boltzmann equation,
only through the distribution of the first $\ell$ particles, the difference decreases
as $N \to \infty$, and we can compute the rate explicitly.

Obviously the statement cannot be true in this generality. To begin with, we
must of course make very precise the statement that the Boltzmann equation is the
formal limit of the $N$-particle system, both in terms of the equations and in terms of
the initial data. The proof can be seen as a perturbation result, where the $N$-particle
systems are treated as perturbations of the limiting equation, and because of that,
the nonlinear semigroup $S_t^\infty$ must satisfy a rather strong regularity condition.

In addition to this, because the actual estimates are carried out in the framework
indicated by Figure 6, we must be very precise when defining the spaces for
everything to work. In particular, the test functions $\varphi$ in equation (21) must be
taken from $(\mathcal{F}_1 \cap \mathcal{F}_2 \cap \mathcal{F}_3)^{\otimes \ell}$, where the $\mathcal{F}_j$ are subspaces of $C(E)$ which are
defined below.
Here, then, are the four main hypotheses on the abstract semigroups and the
spaces they are acting on. While seemingly complicated, they can be readily verified
in some relevant cases, and an example of this will be given later.

(H1) Convergence of the generators: There exists some integer $k \ge 1$,
$\theta \in (0, 1]$ and a space $\mathcal{F}_1 \subset C(E)$ such that

$$\forall \Phi \in C^{k,\theta}(\mathcal{F}_1') , \qquad \left\| (G^N \pi^N - \pi^N G^\infty) \Phi \right\|_{L^\infty(E^N)} \le \varepsilon(N) \, \|\Phi\|_{C^{k,\theta}(\mathcal{F}_1')}$$

for some function $\varepsilon(N)$ going to $0$ as $N$ goes to infinity. Here $\mathcal{F}_1'$ is the
dual of $\mathcal{F}_1$, and $\mathcal{P}(E) \subset \mathcal{F}_1'$, and $C^{k,\theta}(\mathcal{F}_1') \subset C(\mathcal{P}(E), \mathcal{F}_1')$ denotes the
set of Hölder differentiable functions on $\mathcal{P}(E)$. This must be defined, of
course. $C(\mathcal{P}(E), \mathcal{F}_1')$ is the set of continuous functions on $\mathcal{P}(E)$ defined with
the topology given by $\mathcal{F}_1'$.
(H2) Differential stability of the limit semigroup: We assume that for
some affine space $\mathcal{F}_2 \subset \mathcal{F}_1$ such that $\mathcal{F}_2'$ is a Banach space, the flow $S_t^\infty$
on $\mathcal{P}(E)$ is $C^{k,\theta}(\mathcal{F}_1', \mathcal{F}_2')$ uniformly on $[0,T]$, for some integer $k > 0$ and
$\theta \in (0,1]$: there exists $C_T^\infty > 0$ such that

$$\sup_{[0,T]} \| S_t^\infty \|_{C^{k,\theta}(\mathcal{F}_1', \mathcal{F}_2')} \le C_T^\infty .$$

This implies, for example, that the associated pushforward semigroup $T_t^\infty$
maps $C^{k,\theta}(\mathcal{F}_2')$ into $C^{k,\theta}(\mathcal{F}_1')$. Also this hypothesis relies on a stringent
definition of Hölder differentiability in these spaces.
(H3) Weak stability of the limit semigroup: There is a space $\mathcal{F}_3 \subset C(E)$,
such that

$$\forall \mu, \nu \in \mathcal{P}(E) , \qquad \sup_{[0,T]} \mathrm{dist}_{\mathcal{F}_3'}\left( S_t^\infty(\mu), S_t^\infty(\nu) \right) \le \mathrm{dist}_{\mathcal{F}_3'}(\mu, \nu) .$$

In other words, $T_t^\infty$ propagates the $C^{0,1}(\mathcal{F}_3')$ norm.
(H4) Compatibility of the projection: We assume that the dual of $\mathcal{F}_3$, $\mathcal{F}_3'$,
satisfies:

$$\| \pi^N(\Phi) \|_{\mathcal{F}_3^{\otimes N}} = \| \Phi \circ \mu_X^N \|_{\mathcal{F}_3^{\otimes N}} \le C_\pi^N \, \|\Phi\|_{C^{0,1}(\mathcal{F}_3')} .$$

Hence the space and its dual are defined so as to give the maps between
$C(E^N)$ and $C(\mathcal{P}(E))$, shown to the right in Figure 6, good properties, and
this is also why (H3) is needed in addition to (H2).

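With the definitions implicitly given in these hypotheses, it is possible to express
the constant in Equation (21) in more detail:

$$C(k,\ell,N) = C(k,\ell) \Big[ \frac{\ell^2 \|\varphi\|_{L^\infty(E^\ell)}}{N} + T \, C_T^\infty \, \varepsilon(N) \, \|\varphi\|_{\mathcal{F}_2^{k} \otimes (L^\infty)^{\ell-k}}$$
$$\qquad + C_\pi^N C_T^{\infty,w} \, \mathrm{dist}_{\mathcal{F}_3'}(p_0^N, p_0^{\otimes N}) \, \|\varphi\|_{\mathcal{F}_3 \otimes (L^\infty)^{\ell-1}} + C_T^{\infty,w} \, \Omega_N^{\mathcal{F}_3}(p_0) \, \|\varphi\|_{\mathcal{F}_3 \otimes (L^\infty)^{\ell-1}} \Big]$$

where

$$\Omega_N^{\mathcal{F}_3}(p_0) = \int_{E^N} \mathrm{dist}_{\mathcal{F}_3'}(\mu_X^N, p_0) \, p_0^{\otimes N}(dX) .$$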
The first term is purely combinatorial, and the second is related to the convergence
of the generators (as in (H1)) and to (H2). The third term simply says that the
initial data to the $N$-particle system must be close to a tensor product, and the
last term, finally, that the initial data $p_0$ must be well approximated by an
empirical distribution.

This means that if $\mathrm{dist}_{\mathcal{F}_3'}(p_0^N, p_0^{\otimes N}) \to 0$ and $\Omega_N^{\mathcal{F}_3}(p_0) \to 0$, propagation
of chaos holds, with an explicitly computable rate depending on $\varepsilon(N)$,
$\mathrm{dist}_{\mathcal{F}_3'}(p_0^N, p_0^{\otimes N})$ and $\Omega_N^{\mathcal{F}_3}(p_0)$.
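
To get a feeling for the last quantity, the expected distance between an empirical measure of $N$ independent samples and their common law, one can estimate it by Monte Carlo in a simple case. The sketch below rests on assumptions that are not in the text: $E = [0,1]$, $p_0$ is the uniform law, the distance is the Wasserstein distance $W_1$ (computed in one dimension as the integral of the difference of distribution functions), and the function names are hypothetical.

```python
import random

def w1_to_uniform(samples):
    """W_1 between the empirical measure of `samples` in [0, 1] and the uniform law,
    computed exactly as the integral over [0, 1] of |F_N(x) - x|."""
    xs = sorted(samples)
    n = len(xs)
    knots = [0.0] + xs + [1.0]
    total = 0.0
    for i in range(n + 1):
        a, b, c = knots[i], knots[i + 1], i / n   # F_N equals i/n on [a, b)
        if c <= a:
            total += ((b - c) ** 2 - (a - c) ** 2) / 2.0
        elif c >= b:
            total += ((c - a) ** 2 - (c - b) ** 2) / 2.0
        else:
            total += ((c - a) ** 2 + (b - c) ** 2) / 2.0
    return total

def omega_N(n, trials=200, seed=0):
    """Monte Carlo estimate of the expected W_1 distance for N = n i.i.d. uniform samples."""
    rng = random.Random(seed)
    draws = (w1_to_uniform([rng.random() for _ in range(n)]) for _ in range(trials))
    return sum(draws) / trials

for n in (10, 100, 1000):
    print(n, omega_N(n, trials=50))
```

In this one-dimensional setting the estimates decrease as $N$ grows; the rate in higher dimension is slower and is what enters the explicit bounds discussed later.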



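6.3. Differential calculus on $\mathcal{P}(E)$. The requirements on the semigroups, as given
in (H1) and (H2) above, are expressed in terms of differentiability of functions
$\Phi : \mathcal{P}(E) \to \mathbb{R}$ and semigroups $S_t^\infty : \mathcal{P}(E) \to \mathcal{P}(E)$. The exact definitions
are given in this section, together with a couple of examples.

Definition 6.2. Let $G_1$ be an affine metric space and $G_2$ a Banach space, and let
$\mathcal{M}^j(G_1, G_2)$ be the set of bounded $j$-multilinear maps from $G_1$ to $G_2$. We say that
$\psi : G_1 \to G_2$ belongs to $C^{k,\theta}(G_1, G_2)$, the space of functions $k$ times differentiable
with $\theta$ Hölder regularity from $G_1$ to $G_2$, if there exist $D^j \psi : G_1 \to \mathcal{M}^j(G_1, G_2)$ such
that

$$\forall \mu, \nu \in G_1 , \qquad \Big\| \psi(\nu) - \sum_{j=0}^{k} \left\langle D^j \psi(\mu), (\nu - \mu)^{\otimes j} \right\rangle \Big\|_{G_2} \le C \, \mathrm{dist}_{G_1}(\mu, \nu)^{k+\theta} .$$

The norm is

$$\|\psi\|_{C^{k,\theta}(G_1, G_2)} = \sum_{j=1}^{k} \| D^j \psi \|_{C(G_1, \mathcal{M}^j(G_1, G_2))} + \sup_{\mu, \nu \in G_1} \frac{\Big\| \psi(\nu) - \sum_{j=0}^{k} \left\langle D^j \psi(\mu), (\nu - \mu)^{\otimes j} \right\rangle \Big\|_{G_2}}{\mathrm{dist}_{G_1}(\mu, \nu)^{k+\theta}} .$$

In this paper, $G_2$ is either $\mathbb{R}$ or a subset of $C(\mathcal{P}(E))$. Note that for $\theta = 0$,
continuity is not required: $C^{0,0} = L^\infty$.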



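As a first example we show that polynomials are differentiable. Take $\mathcal{F} = \mathrm{Lip}(E)$,
and $\mathcal{F}' = (\mathcal{P}(E), d_{\mathrm{Lip}})$, where the Lipschitz distance is given by

$$d_{\mathrm{Lip}}(\mu, \nu) = \sup \Big\{ \int_E \phi(x) \, (\mu(dx) - \nu(dx)) \; : \; \phi \in \mathrm{Lip}(E) , \ \|\phi\|_{\mathrm{Lip}} = 1 \Big\} .$$

A monomial of degree $k$ in $C(\mathcal{P}(E))$ is defined by

$$\mu \mapsto P_k(\mu) = \int_{E^k} \phi(x_1, \dots, x_k) \, \mu(dx_1) \cdots \mu(dx_k) ,$$

where $\phi \in \mathrm{Lip}(E^k)$. We first compute $P_k(\nu) = P_k(\mu + (\nu - \mu))$. Writing $\mu_j = \mu(dx_j)$
and $\nu_j = \nu(dx_j)$,

$$P_k(\nu) = \int_{E^k} \phi(x_1, \dots, x_k) \, (\mu_1 + (\nu_1 - \mu_1)) \cdots (\mu_k + (\nu_k - \mu_k))$$
$$= P_k(\mu) + \sum_{j=1}^{k} \int_{E^k} \phi(x_1, \dots, x_k) \, \mu_1 \cdots \mu_{j-1} \mu_{j+1} \cdots \mu_k \, (\nu_j - \mu_j) \qquad (=: T_1)$$
$$+ \sum_{i<j} \int_{E^k} \phi(x_1, \dots, x_k) \, [\mu_1 \cdots \mu_k]_{i,j} \, (\nu_i - \mu_i)(\nu_j - \mu_j) \qquad (=: T_2)$$
$$+ \cdots$$

Here $[\mu_1 \cdots \mu_k]_{i,j} = \mu_1 \cdots \mu_{i-1} \mu_{i+1} \cdots \mu_{j-1} \mu_{j+1} \cdots \mu_k$, and $T_1$ represents the first
term in a Taylor expansion, $T_2$ the second etc. The first term, $T_1$, can be rewritten

$$\sum_{j=1}^{k} \int_{E^k} \phi(x_1, \dots, x_k) \, \mu_1 \cdots \mu_{j-1} \mu_{j+1} \cdots \mu_k \, (\nu_j - \mu_j) = \int_E P_{k-1}(\mu; y) \, (\nu(dy) - \mu(dy)) ,$$
$$P_{k-1}(\mu; y) := \sum_{j=1}^{k} \int_{E^{k-1}} \phi(x_1, \dots, y, \dots, x_{k-1}) \, \mu_1 \cdots \mu_{k-1} \qquad (y \text{ in position } j) ,$$

where $P_{k-1}(\mu; y)$ is a polynomial in $\mu$, of degree $k-1$, parameterized by $y$, and
$P_{k-1}(\mu; \cdot) \in \mathrm{Lip}(E)$. As a function of $y \in E$ it is Lipschitz continuous and
$P_{k-1}(\mu; \cdot) \in \mathcal{M}^1(\mathcal{P}(E), \mathbb{R})$ by duality. Finally

$$| P_k(\nu) - P_k(\mu) | \le \| P_{k-1}(\mu; \cdot) \|_{\mathrm{Lip}} \, d_{\mathrm{Lip}}(\mu, \nu) .$$

Therefore these polynomials are once differentiable, but as with polynomials in $\mathbb{R}^n$,
the calculations yield polynomials of a lower degree, and therefore it is possible to
differentiate again.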
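
For $k = 2$ the expansion terminates after the second order term, and this can be checked numerically on a finite state space. In the following sketch, which is an illustration under assumptions rather than part of the argument, $E$ has three points, measures are weight vectors, $\phi$ is a $3 \times 3$ array, and the names `P2` and `DP2` are hypothetical.

```python
def P2(phi, mu):
    """Monomial of degree 2: sum over i, j of phi[i][j] * mu[i] * mu[j]."""
    n = len(mu)
    return sum(phi[i][j] * mu[i] * mu[j] for i in range(n) for j in range(n))

def DP2(phi, mu):
    """First derivative at mu, represented as the function y -> P_1(mu; y) on E."""
    n = len(mu)
    return [sum((phi[y][j] + phi[j][y]) * mu[j] for j in range(n)) for y in range(n)]

phi = [[0.0, 1.0, 2.0],
       [3.0, 0.5, 1.0],
       [2.0, 4.0, 1.5]]
mu = [0.2, 0.3, 0.5]
nu = [0.6, 0.1, 0.3]
d = [b - a for a, b in zip(mu, nu)]          # the signed measure nu - mu

first_order = sum(g * dy for g, dy in zip(DP2(phi, mu), d))
second_order = P2(phi, d)                    # quadratic in nu - mu
print(P2(phi, nu) - (P2(phi, mu) + first_order + second_order))
```

The printed difference vanishes up to rounding, and the remainder `second_order` is quadratic in $\nu - \mu$, in line with the definition of differentiability above.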

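The second example is directly related to the propagation of chaos estimates,
and shows that $T_t^\infty$ is differentiable in $t$. Take $\Phi \in C^{1,\theta}(\mathcal{P})$ and $p_0 \in \mathcal{P}$. Then, by
definition

$$(G^\infty \Phi)(p_0) := \frac{d}{dt} (T_t^\infty \Phi)(p_0) \Big|_{t=0} ,$$

and, from the diagram, $(T_t^\infty \Phi)(p_0) = \Phi(S_t^\infty p_0) = \Phi(p_t)$. Therefore

$$(G^\infty \Phi)(p_0) = \frac{d}{dt} \Phi(p_t) \Big|_{t=0} = \lim_{t \to 0} \frac{\Phi(p_t) - \Phi(p_0)}{t}$$
$$= \lim_{t \to 0} \Big\{ \Big\langle D\Phi[p_0], \frac{p_t - p_0}{t} \Big\rangle + O\Big( \frac{\mathrm{dist}(p_t, p_0)^{1+\theta}}{t} \Big) \Big\}$$
$$= \Big\langle D\Phi[p_0], \frac{d}{dt} p_t \Big|_{t=0} \Big\rangle = \left\langle D\Phi[p_0], Q(p_0) \right\rangle .$$

Here we have used the definition of differentiability for $\Phi$, and we arrive at a
formula for $G^\infty$ in terms of $Q$, the generator of the nonlinear semigroup $S_t^\infty$.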
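6.4. Proof of the abstract theorem. The purpose of this section is to prove the
estimate (21), but many of the details are left out, as they can be found in [38] and
in [37].

Take $\varphi \in (\mathcal{F}_1 \cap \mathcal{F}_2 \cap \mathcal{F}_3)^{\otimes \ell}$. Then (21) can be split in several terms as

$$\left| \left\langle S_t^N(p_0^N) - (S_t^\infty(p_0))^{\otimes N}, \varphi \right\rangle \right|$$
$$\le \left| \left\langle S_t^N(p_0^N), \varphi \otimes 1^{\otimes N-\ell} \right\rangle - \left\langle S_t^N(p_0^N), R[\varphi] \circ \mu_X^N \right\rangle \right|$$
$$+ \left| \left\langle p_0^N, T_t^N(R[\varphi] \circ \mu_X^N) \right\rangle - \left\langle p_0^N, (T_t^\infty R[\varphi]) \circ \mu_X^N \right\rangle \right|$$
$$+ \left| \left\langle p_0^N, (T_t^\infty R[\varphi]) \circ \mu_X^N \right\rangle - \left\langle p_0^{\otimes N}, (T_t^\infty R[\varphi]) \circ \mu_X^N \right\rangle \right|$$
$$+ \left| \left\langle p_0^{\otimes N}, (T_t^\infty R[\varphi]) \circ \mu_X^N \right\rangle - \left\langle (S_t^\infty(p_0))^{\otimes \ell}, \varphi \right\rangle \right| =: \mathcal{T}_1 + \mathcal{T}_2 + \mathcal{T}_3 + \mathcal{T}_4 .$$

Of these terms, $\mathcal{T}_1$ is controlled by purely combinatorial arguments, but the other
terms depend on the hypotheses stated above. Thus the consistency estimate
(H1) on the generators plus the fine stability assumption (H2) on the limit semigroup
gives an estimate of $\mathcal{T}_2$, and $\mathcal{T}_3$, involving the chaoticity of the initial
data, depends on the weak measure stability assumption (H3) on the limit semigroup and
the compatibility condition (H4) on $\pi^N$. Finally, $\mathcal{T}_4$ is controlled in terms of the
function $\Omega_N^{\mathcal{F}_3}(p_0)$ (measuring how well $p_0$ can be approximated in weak $\mathcal{F}_3'$ distance
by empirical measures), and also this estimate relies on the weak measure stability
assumption (H3).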

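Estimate of $\mathcal{T}_1$: For this term,

$$\mathcal{T}_1 := \left| \left\langle S_t^N(p_0^N), \varphi \otimes 1^{\otimes N-\ell} \right\rangle - \left\langle S_t^N(p_0^N), R[\varphi] \circ \mu_X^N \right\rangle \right| .$$

Here $\varphi \otimes 1^{\otimes N-\ell}$ is the function $(x_1, \dots, x_N) \mapsto \varphi(x_1, \dots, x_\ell)$, and because of the
symmetry of $S_t^N(p_0^N)$ it can be replaced by the symmetrized version $(\varphi \otimes 1^{\otimes N-\ell})_{sym}$,
which is obtained as a normalized sum over all permutations of the variables $x_1, \dots, x_N$.
Also,

$$R[\varphi] \circ \mu_X^N = \pi^N R[\varphi] = \int \varphi(y_1, \dots, y_\ell) \, \mu_X^N(dy_1) \cdots \mu_X^N(dy_\ell) ,$$

and therefore, by estimating the number of terms with $\ell$ different coordinates $x_i$ in
$\varphi$, we find

$$\forall N \ge 2\ell , \qquad \left\| (\varphi \otimes 1^{\otimes N-\ell})_{sym} - \pi^N R[\varphi] \right\|_{L^\infty(E^N)} \le \frac{2 \ell^2 \|\varphi\|_{L^\infty(E^\ell)}}{N}$$

for any $\varphi \in C_b(E^\ell)$. It is essentially the same calculation as in the proof of the
Hewitt-Savage theorem in Section 5.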
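
The combinatorial nature of this estimate is easy to see numerically for $\ell = 2$. The following sketch is a hedged illustration: it assumes $E = \mathbb{R}$, the bounded test function $\varphi(x,y) = \cos(x-y)$, and hypothetical helper names; it compares the average of $\varphi$ over pairs of distinct indices with the integral against $\mu_X^N \otimes \mu_X^N$, which also contains the diagonal terms.

```python
import math
import random

def symmetrized(phi, X):
    """Symmetrized phi (x) 1^(N-2): average of phi(x_i, x_j) over ordered pairs with i != j."""
    n = len(X)
    s = sum(phi(X[i], X[j]) for i in range(n) for j in range(n) if i != j)
    return s / (n * (n - 1))

def via_empirical_measure(phi, X):
    """R[phi] evaluated at mu_X^N: average of phi(x_i, x_j) over all ordered pairs."""
    n = len(X)
    s = sum(phi(X[i], X[j]) for i in range(n) for j in range(n))
    return s / (n * n)

phi = lambda x, y: math.cos(x - y)      # sup norm at most 1
rng = random.Random(1)
ell = 2
gaps = {}
for n in (4, 16, 64):
    X = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gaps[n] = abs(symmetrized(phi, X) - via_empirical_measure(phi, X))
    print(n, gaps[n], 2 * ell**2 / n)   # observed gap and the bound 2 l^2 ||phi|| / N
```

The two averages differ only through the $N$ diagonal terms among $N^2$, which is why the gap is of order $1/N$.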

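Estimate of $\mathcal{T}_2$: Here we wish to prove that, for any $t \ge 0$ and any $N \ge 2\ell$,

$$\mathcal{T}_2 := \left| \left\langle p_0^N, T_t^N(R[\varphi] \circ \mu_X^N) \right\rangle - \left\langle p_0^N, (T_t^\infty R[\varphi]) \circ \mu_X^N \right\rangle \right|$$
$$\le C(k,\ell) \, C_T^\infty \, \varepsilon(N) \, \|\varphi\|_{\mathcal{F}_2^{k} \otimes (L^\infty)^{\ell-k}} ,$$

where $C(k,\ell)$ is a constant depending only on $k$ and $\ell$. The proof is based on the
following calculation, in which the role of the generators of the semigroups is made
visible:

$$T_t^N \pi^N - \pi^N T_t^\infty = - \int_0^t \frac{d}{ds} \left( T_{t-s}^N \pi^N T_s^\infty \right) ds = \int_0^t T_{t-s}^N \left[ G^N \pi^N - \pi^N G^\infty \right] T_s^\infty \, ds .$$

From the hypothesis (H1) it follows that, for any $t \in [0, T]$,

$$\left\| T_t^N \pi^N R[\varphi] - \pi^N T_t^\infty R[\varphi] \right\|_{L^\infty(E^N)} \le T \, \varepsilon(N) \sup_{s \in [0,T]} \left\| T_s^\infty R[\varphi] \right\|_{C^{k,\theta}(\mathcal{F}_1')} .$$

The next step is to estimate $T_t^\infty(R[\varphi]) = R[\varphi](S_t^\infty \, \cdot) \in C^{k,\theta}(\mathcal{F}_1')$, and that
computation is carried out using the differential calculus developed above, and the
fact that the chain rule applies also here. The details can be found in [38].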
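
The integral identity used in this estimate can be verified by a short computation. The lines below are a sketch of that verification, assuming only that $G^N$ and $G^\infty$ generate $T_t^N$ and $T_t^\infty$, so that $\frac{d}{ds} T_{t-s}^N = -T_{t-s}^N G^N$ and $\frac{d}{ds} T_s^\infty = G^\infty T_s^\infty$ on suitable functions; questions of domains are ignored here.

```latex
\begin{aligned}
\frac{d}{ds}\Bigl(T^N_{t-s}\,\pi^N\,T^\infty_s\Bigr)
  &= -\,T^N_{t-s}\,G^N\,\pi^N\,T^\infty_s \;+\; T^N_{t-s}\,\pi^N\,G^\infty\,T^\infty_s
   = -\,T^N_{t-s}\bigl[G^N\pi^N-\pi^N G^\infty\bigr]T^\infty_s ,\\[4pt]
T^N_t\,\pi^N-\pi^N\,T^\infty_t
  &= \Bigl[T^N_{t-s}\,\pi^N\,T^\infty_s\Bigr]_{s=0}-\Bigl[T^N_{t-s}\,\pi^N\,T^\infty_s\Bigr]_{s=t}
   = -\int_0^t \frac{d}{ds}\Bigl(T^N_{t-s}\,\pi^N\,T^\infty_s\Bigr)\,ds \\
  &= \int_0^t T^N_{t-s}\bigl[G^N\pi^N-\pi^N G^\infty\bigr]T^\infty_s\,ds .
\end{aligned}
```

Since $T_{t-s}^N$ is the dual of a Markov semigroup it does not increase the supremum norm, so applying (H1) to $\Phi = T_s^\infty R[\varphi]$ under the integral gives the factor $t \, \varepsilon(N) \le T \, \varepsilon(N)$.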

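Estimate of $\mathcal{T}_3$: Take $t \ge 0$ and $N \ge 2\ell$. The desired estimate here is

$$\mathcal{T}_3 = \left| \left\langle p_0^N - p_0^{\otimes N}, (T_t^\infty R[\varphi]) \circ \mu_X^N \right\rangle \right|
\le \mathrm{dist}_{\mathcal{F}_3'}(p_0^N, p_0^{\otimes N}) \, \left\| (T_t^\infty R[\varphi]) \circ \mu_X^N \right\|_{\mathcal{F}_3^{\otimes N}} .$$

But using first (H4) and then (H3) gives

$$\left\| (T_t^\infty R[\varphi]) \circ \mu_X^N \right\|_{\mathcal{F}_3^{\otimes N}} = \left\| \pi^N (T_t^\infty R[\varphi]) \right\|_{\mathcal{F}_3^{\otimes N}}
\le C_\pi^N \left\| T_t^\infty R[\varphi] \right\|_{C^{0,1}(\mathcal{F}_3')} \le C_\pi^N C_T^{\infty,w} \left\| R[\varphi] \right\|_{C^{0,1}(\mathcal{F}_3')} ,$$

and the calculation is completed with $\varphi = \varphi_1 \otimes \cdots \otimes \varphi_\ell \in \mathcal{F}^{\otimes \ell}$, $k \in \mathbb{N}$, $\theta \in (0,1]$.

Estimate of $\mathcal{T}_4$: Here we need to estimate

$$\mathcal{T}_4 := \left| \mathcal{T}_{4,1} - \mathcal{T}_{4,2} \right|$$

for $t \ge 0$ and $N \ge 2\ell$. The first term can be written

$$\mathcal{T}_{4,1} = \int_{E^N} \Big( \prod_{i=1}^{\ell} a_i(X) \Big) \, p_0(dx_1) \cdots p_0(dx_N) ,$$

with

$$a_i = a_i(X) := \int_E \varphi_i(w) \, S_t^\infty(\mu_X^N)(dw) , \qquad i = 1, \dots, \ell ,$$

and similarly

$$\mathcal{T}_{4,2} = \int_{E^N} \Big( \prod_{i=1}^{\ell} b_i \Big) \, p_0(dx_1) \cdots p_0(dx_N) ,$$

with

$$b_i := \int_E \varphi_i(w) \, S_t^\infty(p_0)(dw) , \qquad i = 1, \dots, \ell .$$

A small calculation gives

$$\mathcal{T}_4 \le \sum_{i=1}^{\ell} \Big( \prod_{j \ne i} \|\varphi_j\|_{L^\infty} \Big) \int_{E^N} \left| a_i(X) - b_i \right| \, p_0(dx_1) \cdots p_0(dx_N) ,$$

and finally, using (H3), for any $1 \le i \le \ell$,

$$\left| a_i(X) - b_i \right| = \Big| \int_E \varphi_i(w) \left( S_t^\infty(p_0)(dw) - S_t^\infty(\mu_X^N)(dw) \right) \Big|
\le C_T^{\infty,w} \, \|\varphi_i\|_{\mathcal{F}_3} \, \mathrm{dist}_{\mathcal{F}_3'}(p_0, \mu_X^N) ,$$

which completes the estimate of $\mathcal{T}_4$, and also the proof of Theorem 6.1 under the four
hypotheses on the involved semigroups.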

6.5. Applications of the abstract theorem: the Boltzmann equation. In
this section we shall see how the abstract theorem can be applied to the Boltzmann
equation with bounded collision rates. In that case, the result of Grünbaum [26]
can be applied, and so the only new results that can be deduced from the abstract
theorem are the explicit error bounds.

Some other examples are treated in [38] and in [37], for example

• The McKean-Vlasov equation

• The Boltzmann equation with certain classes of force fields.

• The Boltzmann equation for (e.g.) hard spheres

The last example, which is treated in [37], actually requires some rather technical
modifications of the abstract theorem to handle weighted spaces. All details of this
are given in [37].

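The objective is then to derive the Boltzmann equation,

$$\partial_t p_t = Q(p_t, p_t) , \qquad p_{t=0} = p_0 ,$$

in this case with $p_t \in \mathcal{P}(\mathbb{R}^3)$, and with the right hand side defined by

$$\langle Q(p,p), \phi \rangle := \int_{E^2 \times S^2} \underbrace{\gamma(|w_1 - w_2|) \, b(\theta)}_{B(w_1 - w_2, \theta)} \left( \phi(w_2^*) - \phi(w_2) \right) d\sigma \, p(dw_1) \, p(dw_2) ,$$

which should hold for any $\phi \in C_0(\mathbb{R}^d)$, for any $p \in \mathcal{P}(\mathbb{R}^d)$, with

$$w_1^* = \frac{w_1 + w_2}{2} + \frac{|w_2 - w_1|}{2} \sigma , \qquad w_2^* = \frac{w_1 + w_2}{2} - \frac{|w_2 - w_1|}{2} \sigma .$$

And in order that the collision rate be bounded, we require $B(w_1 - w_2, \theta)$ to be
bounded.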

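The Markov processes on $(\mathbb{R}^3)^N$ are constructed as in the paper by Grünbaum:

- For all pairs of indices $i' \ne j'$ draw $T_{i',j'}$ from an exponential distribution
with parameter $\gamma(|v_{i'} - v_{j'}|)$,
i.e. $P(T_{i',j'} > t) = \exp(-\gamma t)$

- Let $T_1 = \min(T_{i',j'})$ and let $(i,j)$ be the pair for which the minimum is attained

- Draw $\sigma \in S^2$ according to the law $b(\theta_{ij})$, where $\cos \theta_{ij} = \sigma \cdot \frac{v_j - v_i}{|v_j - v_i|}$

- The new state after collision at time $T_1$ becomes

$$V^* = V_{ij}^* = R_{ij,\sigma} V = (v_1, \dots, v_i^*, \dots, v_j^*, \dots, v_N) ,$$

with

$$v_i^* = \frac{v_i + v_j}{2} + \sigma \frac{|v_i - v_j|}{2} , \qquad v_j^* = \frac{v_i + v_j}{2} - \sigma \frac{|v_i - v_j|}{2} .$$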
The Markov process $V_t$ is constructed by repeating the steps above but with time
rescaled with $N$ so that each coordinate jumps one time per unit time, on average.
The law of $V_t$ is denoted $p_t^N$, and the corresponding semigroup $S_t^N$, the dual
semigroup $T_t^N$ and its generator $G^N$.
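
The construction above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions that are not in the text: the rate $\gamma$ is a constant (so that the first clock to ring belongs to a uniformly chosen pair), the angular law $b$ is uniform on the sphere, and each pair rings at rate $\gamma/N$ as one possible reading of the time rescaling. All function names are hypothetical.

```python
import math
import random

def random_unit_vector(rng):
    """Uniform direction on the sphere S^2 (normalised Gaussian vector)."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in g))
        if norm > 1e-12:
            return [c / norm for c in g]

def collide(vi, vj, sigma):
    """Post-collisional velocities: same midpoint, relative velocity turned to sigma."""
    mid = [(a + b) / 2.0 for a, b in zip(vi, vj)]
    half = math.sqrt(sum((a - b) ** 2 for a, b in zip(vi, vj))) / 2.0
    return ([m + half * s for m, s in zip(mid, sigma)],
            [m - half * s for m, s in zip(mid, sigma)])

def simulate(V0, t_end, gamma=1.0, seed=0):
    """Run the jump process up to time t_end (assumed constant rate gamma/N per pair)."""
    rng = random.Random(seed)
    V = [list(v) for v in V0]
    n = len(V)
    total_rate = gamma * (n - 1) / 2.0        # n(n-1)/2 pairs times gamma/n
    t = rng.expovariate(total_rate)           # minimum of the exponential clocks
    while t <= t_end:
        i, j = rng.sample(range(n), 2)        # the pair attaining the minimum
        V[i], V[j] = collide(V[i], V[j], random_unit_vector(rng))
        t += rng.expovariate(total_rate)
    return V

def momentum(V):
    return [sum(v[k] for v in V) for k in range(3)]

def energy(V):
    return sum(c * c for v in V for c in v)

rng0 = random.Random(42)
V0 = [[rng0.gauss(0.0, 1.0) for _ in range(3)] for _ in range(50)]
V1 = simulate(V0, t_end=5.0)
print(momentum(V0), momentum(V1))
print(energy(V0), energy(V1))
```

Each collision preserves the momentum and the kinetic energy of the colliding pair, so both totals are unchanged along the whole trajectory, up to rounding.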

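The master equation on the law $p_t^N$ is given in dual form by

$$\partial_t \langle p_t^N, \varphi \rangle = \langle p_t^N, G^N \varphi \rangle$$

with

$$(G^N \varphi)(V) = \frac{1}{N} \sum_{1 \le i < j \le N} \gamma(|v_i - v_j|) \int_{S^2} b(\theta_{ij}) \left[ \varphi_{ij}^* - \varphi \right] d\sigma ,$$

where $\varphi = \varphi(V)$, $\varphi_{ij}^* = \varphi(V_{ij}^*)$.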

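Theorem 6.3. Assume that $p_0 \in \mathcal{P}(\mathbb{R}^d) \cap M^1(\mathbb{R}^d; \langle v \rangle^{d+5})$, $p_0^N = p_0 \otimes \cdots \otimes p_0$ ($N$ times). Let
$p_t^N = S_t^N(p_0^N)$ be the solution of the $N$-particle master equation and $p_t = S_t^\infty(p_0)$
the solution of the Boltzmann equation. Then there is a constant $C(k,\ell)$, depending
only on $k$ and $\ell$, and $a > 0$ such that for $N \ge 2\ell$, $0 \le t \le T$, and all
$\varphi = \varphi_1 \otimes \cdots \otimes \varphi_\ell$ with $\varphi_i \in C_0(\mathbb{R}^d) \cap \mathrm{Lip}(\mathbb{R}^d)$,
we have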



sup 

[0,T] 



(s?(p»)-(S?(p )f N ),ip 



< C(k,£) 



IMU°= UPoIImi , aT IMIup(fl") , IM|lap(»") IbolUi, 



A r 



A r 



j\Tl/(d+4) 



Proof: The statement of the theorem is a reformulation of Theorem 6.1, and
the proof is carried out by choosing the spaces $\mathcal{F}_1$, $\mathcal{F}_2$, $\mathcal{F}_3$ and verifying that the
hypotheses (H1) ... (H4) hold. This can be done with $\mathcal{F}_1 = \mathcal{F}_2 = C_0(\mathbb{R}^d)$
and $\mathcal{F}_3 = \mathrm{Lip}(\mathbb{R}^d)$.

It follows that propagation of chaos holds, at least for this kind of initial data.
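Proof of (H1). We want to show that there exists $C_1 > 0$ such that

$$\forall \Phi \in C^{1,1}(M^1) , \qquad \left\| (G^N \pi^N - \pi^N G^\infty) \Phi \right\|_{L^\infty(E^N)} \le \frac{C_1}{N} \|\Phi\|_{C^{1,1}(M^1)} .$$

For $\Phi \in C^{1,1}(M^1)$, set $\phi = D\Phi[\mu_V^N]$ and compute

$$G^N(\Phi \circ \mu_V^N) = \frac{1}{2N} \sum_{i,j=1}^{N} \gamma(|v_i - v_j|) \int_{S^{d-1}} b(\theta_{ij}) \left[ \Phi(\mu_{V_{ij}^*}^N) - \Phi(\mu_V^N) \right] d\sigma$$
$$= \frac{1}{2N} \sum_{i,j=1}^{N} \gamma(|v_i - v_j|) \int_{S^{d-1}} b(\theta_{ij}) \left\langle \mu_{V_{ij}^*}^N - \mu_V^N, \phi \right\rangle d\sigma \qquad (=: I_1(V))$$
$$+ \frac{1}{2N} \sum_{i,j=1}^{N} \gamma(|v_i - v_j|) \int_{S^{d-1}} b(\theta_{ij}) \, O\Big( \|\Phi\|_{C^{1,1}} \left\| \mu_{V_{ij}^*}^N - \mu_V^N \right\|_{M^1}^2 \Big) d\sigma \qquad (=: I_2(V)) .$$

The first term, $I_1$, is estimated with (recall that $\mu_V^N = \frac{1}{N} \sum_{j=1}^{N} \delta_{v_j}$)

$$I_1(V) = \frac{1}{2} \int \gamma(|v - w|) \, b(\theta) \left[ \phi(v^*) + \phi(w^*) - \phi(v) - \phi(w) \right] \mu_V^N(dv) \, \mu_V^N(dw) \, d\sigma$$
$$= \left\langle Q(\mu_V^N, \mu_V^N), \phi \right\rangle = (G^\infty \Phi)(\mu_V^N) ,$$

and the second, $I_2(V)$, as

$$I_2(V) = \frac{1}{2N} \sum_{i,j=1}^{N} \gamma(|v_i - v_j|) \int_{S^{d-1}} O\Big( \|\Phi\|_{C^{1,1}} \Big( \frac{4}{N} \Big)^2 \Big) d\sigma
\le C \, \|\gamma\|_{L^\infty} \|\Phi\|_{C^{1,1}} \frac{1}{N} .$$

Together these yield the desired estimate.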

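Proof of (H2). Set $k = \theta = 1$. We will show that for all $\mu, \mu' \in \mathcal{P}(\mathbb{R}^d)$ and for
any $T > 0$, there exists $C_T > 0$ such that

(22) $$\sup_{t \in [0,T]} \left\| S_t^\infty(\mu') - S_t^\infty(\mu) - \mathcal{L}S_t^\infty[\mu](\mu' - \mu) \right\|_{M^1} \le C_T \, \|\mu' - \mu\|_{M^1}^2 ,$$

where $\mathcal{L}S_t^\infty[\mu]$ is the linear semigroup associated to the solution $S_t^\infty \mu$. Hence (H2)
holds with $\mathcal{F}_2 = \mathcal{F}_1 = C_0(\mathbb{R}^d)$, $\mathcal{F}_2' = M^1(\mathbb{R}^d)$ when $k = \theta = 1$. To prove (22)
consider

$$\partial_t f_t = Q(f_t, f_t) , \qquad f_0 = \mu ,$$
$$\partial_t g_t = Q(g_t, g_t) , \qquad g_0 = \mu' ,$$
$$\partial_t h_t = \tilde{Q}(f_t, h_t) := Q(f_t, h_t) + Q(h_t, f_t) , \qquad h_0 = g_0 - f_0 = \mu' - \mu .$$

The solutions to these equations are $f$, $g$, and $h$, and the difference $g - f - h$ is the
remainder term in (22). This can be estimated by a Gronwall argument to give the estimate,
with $C_T \sim e^{CT}$.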

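Proof of (H3). Take $\mathcal{F}_3 = \mathrm{Lip}(\mathbb{R}^d)$. The Wasserstein (or Tanaka) distance between
two measures is defined as

$$W_1(\mu, \nu) = \inf_{\gamma \in \Gamma(\mu, \nu)} \int_{E \times E} |x - y| \, d\gamma(x, y) = \inf \mathbb{E}\left[ |X - Y| \right] .$$

Here $\Gamma(\mu, \nu)$ is the collection of $\gamma \in \mathcal{P}(E \times E)$ such that $\mu = \int_E \gamma(\cdot, dy)$ and
$\nu = \int_E \gamma(dx, \cdot)$. An equivalent definition is

$$W_1(\mu, \nu) = \sup \Big\{ \int \phi(x) \, (\mu(dx) - \nu(dx)) \; : \; \|\phi\|_{\mathrm{Lip}} \le 1 \Big\} .$$

Tanaka [151 El] proved that if $p_t$, $\tilde{p}_t$ are the solutions of the Boltzmann equation
with initial data $p_0, \tilde{p}_0 \in \mathcal{P}(\mathbb{R}^d)$, then

$$\sup_{[0,T]} W_1(p_t, \tilde{p}_t) \le W_1(p_0, \tilde{p}_0) .$$

Now (H3),

$$\forall \mu, \nu \in \mathcal{P}(E) , \qquad \sup_{[0,T]} \mathrm{dist}_{\mathcal{F}_3'}\left( S_t^\infty(\mu), S_t^\infty(\nu) \right) \le \mathrm{dist}_{\mathcal{F}_3'}(\mu, \nu) ,$$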






with $\mathcal{F}_3 = \mathrm{Lip}(\mathbb{R}^d)$, follows immediately from Tanaka's result.

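Proof of (H4). Here we need to check that the dual of $\mathcal{F}_3 = \mathrm{Lip}(\mathbb{R}^d)$, $\mathcal{F}_3'$, satisfies

$$\| \pi^N(\Phi) \|_{\mathcal{F}_3^{\otimes N}} = \| \Phi \circ \mu_X^N \|_{\mathcal{F}_3^{\otimes N}} \le C_\pi \, \|\Phi\|_{C^{0,1}(\mathcal{F}_3')} .$$

For any $\Phi \in C^{0,1}(\mathcal{F}_3')$,

$$\| \pi^N \Phi \|_{\mathrm{Lip}((\mathbb{R}^d)^N)} \le \sup_{X \ne Y} \frac{\left| \Phi(\mu_X^N) - \Phi(\mu_Y^N) \right|}{|X - Y|} \le \|\Phi\|_{C^{0,1}} .$$

This implies (H4), and hence all four hypotheses are satisfied, and Theorem 6.3
is a consequence of Theorem 6.1.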
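
The key point behind (H4) in this setting is that the map $X \mapsto \mu_X^N$ is Lipschitz from $E^N$ to $(\mathcal{P}(E), W_1)$. The sketch below checks this numerically in one dimension. It relies on assumptions made for the example only: $E = \mathbb{R}$, the standard fact that $W_1$ between two empirical measures with equally many atoms on the line is attained by matching the sorted samples, and the hypothetical function name `w1_empirical`.

```python
import random

def w1_empirical(X, Y):
    """W_1 between the empirical measures of two equally long lists of reals,
    computed with the monotone (sorted) coupling."""
    n = len(X)
    return sum(abs(a - b) for a, b in zip(sorted(X), sorted(Y))) / n

rng = random.Random(3)
X = [rng.gauss(0.0, 1.0) for _ in range(20)]
Y = [x + rng.uniform(-0.5, 0.5) for x in X]

lhs = w1_empirical(X, Y)
rhs = sum(abs(a - b) for a, b in zip(X, Y)) / len(X)   # normalised l^1 distance on E^N
print(lhs, rhs)

# The empirical measure does not see the order of the coordinates.
shuffled = X[:]
rng.shuffle(shuffled)
print(w1_empirical(X, shuffled))
```

The first printed number never exceeds the second, because pairing $x_i$ with $y_i$ is one admissible coupling; the last line prints zero since permuting the coordinates leaves $\mu_X^N$ unchanged.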

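6.6. Other examples and comments. Another model that is covered by Theorem 6.1
is the McKean-Vlasov system [35]. Here the $N$-particle system is defined
as

$$dx_t^i = \sigma \, dB_t^i + F^N(x_t^i, \mu^{N-1}_{\hat{X}_t^i}) \, dt , \qquad 1 \le i \le N ,$$

with $\hat{X}^i := (x^1, \dots, x^{i-1}, x^{i+1}, \dots, x^N)$ and $F^N : \mathbb{R}^d \times \mathcal{P}(\mathbb{R}^d) \to \mathbb{R}^d$. The nonlinear
McKean-Vlasov equation on $\mathcal{P}(\mathbb{R}^d)$ is defined by

$$\partial_t p_t = Q(p_t) , \qquad p(0) = p_0 \ \text{in} \ \mathcal{P}(\mathbb{R}^d) ,$$

with

$$Q(p) = \frac{1}{2} \sum_{\alpha, \beta = 1}^{d} \partial^2_{\alpha\beta} (\Sigma_{\alpha\beta} \, p) - \sum_{\alpha = 1}^{d} \partial_\alpha (F_\alpha(x, p) \, p) .$$

In this case, the hypotheses (H1) to (H4) can be verified with $\mathcal{F}_1 = H^2$, $\mathcal{F}_2 = H^{s+2}$,
$\mathcal{F}_3 = \mathrm{Lip}(\mathbb{R}^d)$.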
But the abstract theorem presented here does not cover e.g. the Boltzmann equation
for hard spheres, i.e. the case that Grünbaum attempted to solve. A more
detailed analysis, involving weighted spaces, is required for that. A proof is given
in [37].

Another important result in [37] is that in some cases all estimates can be carried
out uniformly in time (contrary to the estimate above, which involves constants
that grow exponentially with the time interval). It is not at all obvious that such
a result could be true, considering the calculations carried out in Section 4.2. For
large times the exponential $e^{tL}$ will be dominated by large powers of $L$, and for
any fixed $N$, the same variables must be reused many times, potentially creating
correlations that remain also when $N$ increases. For the Boltzmann equation and
the related $N$-particle systems, the stationary measures of the $N$-particle systems
are themselves chaotic, and this may help in getting the uniform estimates.

However, the model of flocking described in Section 15751 does not have this property.
It is a "pair interaction driven master equation"; such equations are defined in [S], where
it is also proven that propagation of chaos holds for all times, but that the stationary
states for the $N$-particle systems are not chaotic. Another model studied in [5]
is called a "choose the leader model". In that model a pair interacts in such a way
that one of the two particles (randomly chosen in the pair) tries to take the other
particle's velocity, but makes a random error. That is also a pair interaction driven
master equation, and in this case some calculations can be carried out explicitly,
and in particular one can find explicit expressions for the marginal distributions.
These expressions show that the stationary states are not chaotic.

Propagation of chaos is an important concept, and many questions remain open, 
most notably the question of propagation of chaos for a deterministic particle system 
and a rigorous derivation of the Boltzmann equation, valid over a macroscopic time 
interval. I hope that these notes have given some flavour of this, and I recommend
that the reader look in the literature for many more results. Some relevant references

are HQ a cm HE E M El EH H3 • 